+1 for naming as python containers, and quick release so users can try it.

Not related to this thread, but I am also curious about the reasons for removing
the Go Docker images. Was this discussed/voted on in the ML (maybe I missed it)?

I don't think Beam has historically been a conservative project about releasing
early in-progress versions, and I have learned to appreciate this because it
enables early user testing and bug reports, which will definitely be a must for
Java 11.

We should read the ticket Kyle mentions with a grain of salt. Most of the
sub-tasks in that ticket are NOT about allowing users to run pipelines with Java
11 but about being able to fully build the Beam source code and run its tests
with Java 11, which is a different goal (important, but probably less so for
end users) and a task with lots of extra issues because of plugins, dependent
systems, etc.

For the Java 11 harness, what we need to guarantee is that users can run their
code without issues on Java 11. We can do this now, for example, by checking
that portable runners that support Java 11 pass ValidatesRunner with the Java 11
harness. Since some classic runners [1] already pass these tests, it should be
relatively 'easy' to do so for portable runners.

[1] https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/




On Sat, Jul 11, 2020 at 12:43 AM Ahmet Altay <[email protected]> wrote:
>
> Related to the naming question, +1 and this will be similar to the python 
> container naming (e.g. beam_python3.7_sdk).
>
> On Fri, Jul 10, 2020 at 1:46 PM Pablo Estrada <[email protected]> wrote:
>>
>> I agree with Kenn. Dataflow already has some publishing of non-portable Java 
>> 11 containers, so I think it'll be great to formalize the process for 
>> portable containers, and let users play with it, and know of its 
>> availability.
>> Best
>> -P.
>>
>> On Fri, Jul 10, 2020 at 9:42 AM Kenneth Knowles <[email protected]> wrote:
>>>
>>> To the initial question: I'm +1 on the rename. The container is primarily 
>>> something that the SDK should insert into the pipeline proto during 
>>> construction, and only user-facing in more specialized situations. Given 
>>> the state of Java and portability, it is a good time to get things named 
>>> properly and unambiguously. I think a brief announce to dev@ and user@ when 
>>> it happens is nice-to-have, but no need to give advance warning.
>>>
>>> Kenn
>>>
>>> On Fri, Jul 10, 2020 at 7:58 AM Kenneth Knowles <[email protected]> wrote:
>>>>
>>>> I believe Beam already has quite a few users that have forged ahead and 
>>>> used Java 11 with various runners, pre-portability. Mostly I believe the 
>>>> Java 11 limitations are with particular features (Schema codegen) and 
>>>> extensions/IOs/transitive deps.
>>>>
>>>> When it comes to the container, I'd be interested in looking at test 
>>>> coverage. The Flink & Spark portable ValidatesRunner suites use EMBEDDED 
>>>> environment, so they don't exercise the container. The first testing of 
>>>> the Java SDK harness container against the Python-based Universal Local 
>>>> Runner is in pull request now [1]. Are there other test suites to 
>>>> highlight? How hard would it be to run Flink & Spark against the 
>>>> container(s) too?
>>>>
>>>> Kenn
>>>>
>>>> [1] https://github.com/apache/beam/pull/11792 (despite the name 
>>>> ValidatesRunner, in this case it is validating both the runner and 
>>>> harness, since we don't have a compliance test suite for SDK harnesses)
>>>>
>>>> On Fri, Jul 10, 2020 at 7:54 AM Tyson Hamilton <[email protected]> wrote:
>>>>>
>>>>> What do we consider 'ready'?
>>>>>
>>>>> Maybe the only required outstanding bugs are supporting the direct runner 
>>>>> (BEAM-10085), core tests (BEAM-10081), IO tests (BEAM-10084)  to start 
>>>>> with? Notably this would exclude failing tests like those for GCP core, 
>>>>> GCPIOs, Dataflow runner, Spark runner, Flink runner, Samza.
>>>>>
>>>>>
>>>>> On Thu, Jul 9, 2020 at 4:44 PM Kyle Weaver <[email protected]> wrote:
>>>>>>
>>>>>> My main question is, are we confident the Java 11 container is ready to 
>>>>>> release? AFAIK there are still a number of issues blocking full Java 11 
>>>>>> support (cf [1]; not sure how many of these, if any, affect the SDK 
>>>>>> harness specifically though.)
>>>>>>
>>>>>> For comparison, we recently decided to stop publishing Go SDK containers 
>>>>>> until the Go SDK is considered mature [2]. In the meantime, those who 
>>>>>> want to use the Go SDK can build their own container images from source.
>>>>>>
>>>>>> Do we already have a Gradle task to build Java 11 containers? If not, 
>>>>>> this would be a good intermediate step, letting users opt-in to Java 11 
>>>>>> without us overpromising support.
>>>>>
>>>>>
>>>>> We do not. From what I can tell, the build.gradle [1] for the Java 
>>>>> container is only for the one version. There is a docker file used for 
>>>>> Jenkins tests.
>>>>>
>>>>> [1] 
>>>>> https://github.com/apache/beam/blob/master/sdks/java/container/build.gradle
>>>>>
>>>>>>
>>>>>>
>>>>>> When we eventually do the renaming, we can add a note to CHANGES.md [3].
>>>>>>
>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-10090
>>>>>> [2] https://issues.apache.org/jira/browse/BEAM-9685
>>>>>> [3] https://github.com/apache/beam/blob/master/CHANGES.md
>>>>>>
>>>>>> On Thu, Jul 9, 2020 at 3:44 PM Emily Ye <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm getting ramped up on contributing and was looking into adding the 
>>>>>>> Java 11 harness container to releases 
>>>>>>> (https://issues.apache.org/jira/browse/BEAM-8106) - should I rename the 
>>>>>>> current java container so we have two new images `beam_java8_sdk` and 
>>>>>>> `beam_java11_sdk` or hold off on renaming? If we do rename it, what 
>>>>>>> steps should I take to announce/document the change?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Emily
