Reminder that I'm looking for review on https://github.com/apache/beam/pull/7160
On Thu, Nov 29, 2018, 11:48 AM Jeff Klukas <[email protected] wrote: > I created a JIRA and a PR for this: > > https://issues.apache.org/jira/browse/BEAM-6150 > https://github.com/apache/beam/pull/7160 > > On naming, I'm proposing that SerializableFunction extend ProcessFunction > (since this new superinterface is particularly appropriate for user code > executed inside a ProcessElement method) and that SimpleFunction extend > InferableFunction (since type information and coder inference are what > distinguish this from ProcessFunction). > > We originally discussed deprecating SerializableFunction and > SimpleFunction in favor of the new types, but there appear to be two fairly > separate use cases for SerializableFunction. It's either defining user code > that will be executed in a DoFn, in which case I think we always want to > prefer the new interface that allows declared exceptions. But it's also > used where the code is to be executed as part of pipeline construction, in > which case it may be reasonable to want to restrict checked exceptions. In > any case, deprecating SerializableFunction and SimpleFunction can be > discussed further in the future. > > On Wed, Nov 28, 2018 at 9:53 PM Kenneth Knowles <[email protected]> wrote: > >> Nice! A clean solution and an opportunity to bikeshed on names. This has >> everything I love. >> >> Kenn >> >> On Wed, Nov 28, 2018 at 6:43 PM Jeff Klukas <[email protected]> wrote: >> >>> It looks like we can add make the new interface a superinterface for the >>> existing SerializableFunction while maintaining binary compatibility [0]. >>> >>> We'd have: >>> >>> public interface NewSerializableFunction<InputT, OutputT> extends >>> Serializable { >>> OutputT apply(InputT input) throws Exception; >>> } >>> >>> and then modify SerializableFunction to inherit from it: >>> >>> public interface SerializableFunction<InputT, OutputT> extends >>> NewSerializableFunction<InputT, OutputT>, Serializable { >>> @Override >>> OutputT apply(InputT input); >>> } >>> >>> >>> IIUC, we can then more or less replace all references to >>> SerializableFunction with NewSerializableFunction across the beam codebase >>> without having to introduce any new overrides. I'm working on a proof of >>> concept now. >>> >>> It's not clear what the actual names for NewSerializableFunction and >>> NewSimpleFunction should be. >>> >>> [0] >>> https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.4.4 >>> >>> >>> On Mon, Nov 26, 2018 at 9:54 PM Thomas Weise <[email protected]> wrote: >>> >>>> +1 for introducing the new interface now and deprecating the old one. >>>> The major version change then provides the opportunity to remove deprecated >>>> code. >>>> >>>> >>>> On Mon, Nov 26, 2018 at 10:09 AM Lukasz Cwik <[email protected]> wrote: >>>> >>>>> Before 3.0 we will still want to introduce this giving time for people >>>>> to migrate, would it make sense to do that now and deprecate the >>>>> alternatives that it replaces? >>>>> >>>>> On Mon, Nov 26, 2018 at 5:59 AM Jeff Klukas <[email protected]> >>>>> wrote: >>>>> >>>>>> Picking up this thread again. Based on the feedback from Kenn, >>>>>> Reuven, and Romain, it sounds like there's no objection to the idea of >>>>>> SimpleFunction and SerializableFunction declaring that they throw >>>>>> Exception. So the discussion at this point is about whether there's an >>>>>> acceptable way to introduce that change. >>>>>> >>>>>> IIUC correctly, Kenn was suggesting that we need to ensure backwards >>>>>> compatibility for existing user code both at runtime and recompile, which >>>>>> means we can't simply add the declaration to the existing interfaces, >>>>>> since >>>>>> that would cause errors at compile time for user code directly invoking >>>>>> SerializableFunction instances. >>>>>> >>>>>> I don't see an obvious way that introducing a new functional >>>>>> interface would help without littering the API with more variants (it's >>>>>> already a bit confusing that i.e. MapElements has multiple via() methods >>>>>> to >>>>>> support three different function interfaces). >>>>>> >>>>>> Perhaps this kind of cleanup is best left for Beam 3.0? >>>>>> >>>>>> On Mon, Oct 15, 2018 at 6:51 PM Reuven Lax <[email protected]> wrote: >>>>>> >>>>>>> Compilation compatibility is a big part of what Beam aims to provide >>>>>>> with its guarantees. Romain makes a good point that most users are not >>>>>>> invoking SeralizableFunctions themselves (they are usually invoked >>>>>>> inside >>>>>>> of Beam classes such as MapElements), however I suspect some users do >>>>>>> these >>>>>>> things. >>>>>>> >>>>>>> On Sun, Oct 14, 2018 at 2:38 PM Kenneth Knowles <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Romain has brought up two good aspects of backwards compatibility: >>>>>>>> >>>>>>>> - runtime replacement vs recompile >>>>>>>> - consumer (covariant) vs producer (contravariant, such as >>>>>>>> interfaces a user implements) >>>>>>>> >>>>>>>> In this case, I think the best shoice is covariant and >>>>>>>> contravariant (invariant) backwards compat including recompile compat. >>>>>>>> But >>>>>>>> we shouldn't assume there is one obvious definition of "backwards >>>>>>>> compatibility". >>>>>>>> >>>>>>>> Does it help to introduce a new functional interface? >>>>>>>> >>>>>>>> Kenn >>>>>>>> >>>>>>>> On Sun, Oct 14, 2018 at 6:25 AM Romain Manni-Bucau < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> Beam does not catch Exception for function usage so it will have >>>>>>>>> to do it in some places. >>>>>>>>> >>>>>>>>> A user does not have to execute the function so worse case it >>>>>>>>> impacts tests and in any case the most important: it does not impact >>>>>>>>> the >>>>>>>>> user until it recompiles the code (= runtime is not impacted). >>>>>>>>> >>>>>>>>> Romain Manni-Bucau >>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>>>>>>> <http://rmannibucau.wordpress.com> | Github >>>>>>>>> <https://github.com/rmannibucau> | LinkedIn >>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>>>>>>> >>>>>>>>> >>>>>>>>> Le dim. 14 oct. 2018 à 15:19, Reuven Lax <[email protected]> a >>>>>>>>> écrit : >>>>>>>>> >>>>>>>>>> What in Beam codebase is not ready, and how do we know that user >>>>>>>>>> code doesn't have the same issue? >>>>>>>>>> >>>>>>>>>> On Sun, Oct 14, 2018 at 6:04 AM Romain Manni-Bucau < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Hmm, tested also and it works until something in the codeflow >>>>>>>>>>> does not respect that constraint - see >>>>>>>>>>> com.sun.tools.javac.comp.Flow.FlowAnalyzer#errorUncaught. In other >>>>>>>>>>> words >>>>>>>>>>> beam codebase is not ready for that and will make it fail but it is >>>>>>>>>>> ok >>>>>>>>>>> cause we can fix it but user code does not rely on that so it is >>>>>>>>>>> fine to >>>>>>>>>>> update it normally. >>>>>>>>>>> >>>>>>>>>>> Romain Manni-Bucau >>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github >>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn >>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Le dim. 14 oct. 2018 à 14:39, Reuven Lax <[email protected]> a >>>>>>>>>>> écrit : >>>>>>>>>>> >>>>>>>>>>>> Just tried it, doesn't appear to work :( >>>>>>>>>>>> >>>>>>>>>>>> error: unreported exception Exception; must be caught or >>>>>>>>>>>> declared to be thrown >>>>>>>>>>>> >>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:55 AM Romain Manni-Bucau < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> not with java>=8 AFAIK >>>>>>>>>>>>> >>>>>>>>>>>>> Romain Manni-Bucau >>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github >>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn >>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Le dim. 14 oct. 2018 à 10:49, Reuven Lax <[email protected]> a >>>>>>>>>>>>> écrit : >>>>>>>>>>>>> >>>>>>>>>>>>>> But it means that other functions that call >>>>>>>>>>>>>> SerializableFunctions must now declare exceptions, right? If >>>>>>>>>>>>>> yes, this is >>>>>>>>>>>>>> incompatible. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sun, Oct 14, 2018 at 1:37 AM Romain Manni-Bucau < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> No, only parameter types and return type is used to lookup >>>>>>>>>>>>>>> methods. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Romain Manni-Bucau >>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github >>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn >>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Le dim. 14 oct. 2018 à 09:45, Reuven Lax <[email protected]> >>>>>>>>>>>>>>> a écrit : >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've run into this problem before as well. Doesn't changing >>>>>>>>>>>>>>>> the signature involve a backwards-incompatible change though? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Oct 3, 2018 at 5:11 PM Jeff Klukas < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm working on >>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-5638 to add >>>>>>>>>>>>>>>>> exception handling options to single message transforms in >>>>>>>>>>>>>>>>> the Java SDK. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> MapElements' via() method is overloaded to accept either a >>>>>>>>>>>>>>>>> SimpleFunction, a SerializableFunction, or a Contextful, all >>>>>>>>>>>>>>>>> of which are >>>>>>>>>>>>>>>>> ultimately stored as a Contextful where the mapping >>>>>>>>>>>>>>>>> functionis expected to >>>>>>>>>>>>>>>>> have signature: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> OutputT apply(InputT element, Context c) throws Exception; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So Contextful.Fn allows throwing checked exceptions, but >>>>>>>>>>>>>>>>> neither >>>>>>>>>>>>>>>>> SerializableFunction nor SimpleFunction do. The >>>>>>>>>>>>>>>>> user-provided function has to satisfy the more restrictive >>>>>>>>>>>>>>>>> signature: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> OutputT apply(InputT input); >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Is there background about why we allow arbitrary checked >>>>>>>>>>>>>>>>> exceptions to be thrown in one case but not the other two? >>>>>>>>>>>>>>>>> Could we >>>>>>>>>>>>>>>>> consider expanding SerializableFunction and >>>>>>>>>>>>>>>>> SimpleFunction to the following?: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> OutputT apply(InputT input) throws Exception; >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This would, for example, simplify the implementation of >>>>>>>>>>>>>>>>> ParseJsons and AsJsons, where we have to catch an IOException >>>>>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>> MapElements#via only to rethrow as RuntimeException. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>
