On Wed, Apr 29, 2020 at 3:49 PM Valentyn Tymofieiev <valen...@google.com> wrote:
> > > On Tue, Apr 28, 2020 at 5:03 PM Robert Bradshaw <rober...@google.com> > wrote: > >> On Tue, Apr 28, 2020 at 4:46 PM Hannah Jiang <hannahji...@google.com> >> wrote: >> >>> I guess I assumed there was some reason we needed "lightweight images" >>>> in our tests (because licenses take up a lot of space IIRC), but maybe not. >>>> Can you elaborate on the purpose of this option Hannah? >>> >>> Reducing image size is a good reason to have the pull option. There were >>> user requests to create lightweight images. Now I think users can use skip >>> option to create lightweight iamges. pull option is not needed. >>> >>> Then the discussion becomes which one should be the default mode? >>> According to feedback, skip should be the default mode, and change it to >>> add mode when running Jenkins test. And the docker-pull-licenses tag is >>> binary again. >>> It seems like there is an easy way to pass the tag to all Jenkins test, >>> which is adding *context.switches("-Pdocker-pull-licenses"**)* to >>> CommonJobProperties.groovy >>> <https://github.com/apache/beam/blob/master/.test-infra/jenkins/CommonJobProperties.groovy#L150>. >>> I'm not very familiar with Jenkins, does this would work as expected? >>> If it sounds like a bad idea to pass this tag to ALL tests, I would ask >>> help from community to identify those tests which create docker images >>> (Java and Python for now) and pass the tag to the tests. >>> >> >> That could work. >> >> The other question is, how easy would it be for a release manager to >> accidentally push non-compliant images? >> > > Echoing my previous question: would it make sense to add licenses whenever > we build containers on the release branch, which can be checked by [1]? > I think this is a good idea. We can also add a validation task to the validation spreadsheet to manually check that containers are shipped with licenses. > [1] > https://github.com/apache/beam/blob/74a6565c8b64d9fadf256370df47a4c5dadafb55/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L296 > > > > >> >> We have definitely not explored how cheap we can make doing the "real >> thing" can be, which would IMHO be best. >> >> >>> >>> >>> On Tue, Apr 28, 2020 at 3:50 PM Kyle Weaver <kcwea...@google.com> wrote: >>> >>>> > To overwrite, we need to pass the tag for ALL Jenkins tests that >>>> create docker images, which is quite a bunch of work. >>>> >>>> If it comes to that, I'd rather we do the work of passing tags instead >>>> of users. >>>> >>>> > Does anyone know if there is an easy way to pass the tag to many >>>> tests? or overwrite the default mode, which will be defined at >>>> gradle.properties, to pull ONLY with Jenkins tests? >>>> >>>> There is precedence for the latter option: >>>> https://github.com/apache/beam/blob/1905dbde5ca0858fce89f65c761e88511840a384/build.gradle#L45 >>>> >>>> > (On that note, however, pull seems bad for both.) >>>> >>>> I guess I assumed there was some reason we needed "lightweight images" >>>> in our tests (because licenses take up a lot of space IIRC), but maybe not. >>>> Can you elaborate on the purpose of this option Hannah? >>>> >>>> On Tue, Apr 28, 2020 at 6:40 PM Robert Bradshaw <rober...@google.com> >>>> wrote: >>>> >>>>> Fundamentally having license checking off by default is dangerous for >>>>> releases, but having it on by default is annoying for developers. (On that >>>>> note, however, pull seems bad for both.) Is it possible to make it a >>>>> gradle >>>>> target that only runs when something (specifically dependencies, or the >>>>> files declaring them) have changed, and would leverage the gradle cache >>>>> (and hence be cheap) otherwise? >>>>> >>>>> Alternatively, I think we should invest in URL caching. These urls >>>>> should be immutable; let's only download them once, ever. >>>>> >>>>> On Tue, Apr 28, 2020 at 3:31 PM Hannah Jiang <hannahji...@google.com> >>>>> wrote: >>>>> >>>>>> Do you mean set the default mode to skip and overwrite it to pull >>>>>> mode with Jenkins test? To overwrite, we need to pass the tag for ALL >>>>>> Jenkins tests that create docker images, which is quite a bunch of work. >>>>>> Does anyone know if there is an easy way to pass the tag to many >>>>>> tests? or overwrite the default mode, which will be defined at >>>>>> gradle.properties, to pull ONLY with Jenkins tests? We can add a step to >>>>>> do >>>>>> this after cloning a git branch to Jenkins machine. But it seems easier >>>>>> to >>>>>> change the local gradle.properties for local development purpose..? >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Apr 28, 2020 at 3:09 PM Thomas Weise <t...@apache.org> wrote: >>>>>> >>>>>>> Can this be solved by enabling the license magic for the Jenkins >>>>>>> jobs? >>>>>>> >>>>>>> I also think the default should be off for better local development >>>>>>> experience. >>>>>>> >>>>>>> On Tue, Apr 28, 2020 at 3:05 PM Hannah Jiang <hannahji...@google.com> >>>>>>> wrote: >>>>>>> >>>>>>>> We need to make sure the release images contains licenses/notices >>>>>>>> in order to avoid potential legal issues. The purpose of setting the >>>>>>>> default to pull is checking PRs which introduce new dependencies >>>>>>>> include >>>>>>>> thier licenses as well, in a way of auto pulling or tool pulling, to >>>>>>>> make >>>>>>>> sure all licenses are included when we create release images. If >>>>>>>> setting >>>>>>>> default to skip, we only check licenses when we create release images >>>>>>>> and >>>>>>>> may see many issues with the licenses, and the release would be >>>>>>>> delayed, >>>>>>>> maybe a lot. >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Apr 28, 2020 at 2:50 PM Kyle Weaver <kcwea...@google.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> The different modes make sense. One disagreement: I think the >>>>>>>>> default should be skip. I imagine few users would want to put >>>>>>>>> licenses in >>>>>>>>> their own images. >>>>>>>>> >>>>>>>>> On Tue, Apr 28, 2020 at 5:36 PM Hannah Jiang < >>>>>>>>> hannahji...@google.com> wrote: >>>>>>>>> >>>>>>>>>> The above 1,2,3,4 are merged to master. >>>>>>>>>> Here I would like to change behaviors with *docker-pull-licenses* >>>>>>>>>> tag and would like to collect feedbacks from community before >>>>>>>>>> finalizing my >>>>>>>>>> PR. >>>>>>>>>> >>>>>>>>>> docker-pull-licenses can be set to one of ['add', 'pull', 'skip']. >>>>>>>>>> >>>>>>>>>> - *docker-pull-licenses=add* pulls licenses/notices and the >>>>>>>>>> files are added to docker images. This tag is used to create >>>>>>>>>> release docker >>>>>>>>>> images. >>>>>>>>>> - *docker-pull-licenses=pull* pulls licenses/notices but do >>>>>>>>>> not add the files to docker images so that create lightweight >>>>>>>>>> images. This >>>>>>>>>> is the default mode, and docker images created by Jenkins test >>>>>>>>>> use this >>>>>>>>>> mode. This will make sure the files are all pullable and detect >>>>>>>>>> issues with >>>>>>>>>> the tool, except the ADD part at DockerFile, which is rarely >>>>>>>>>> likely to be >>>>>>>>>> an issue. It is better than the checking URLs approach mentioned >>>>>>>>>> above, >>>>>>>>>> because it is a more similar process like creating release images >>>>>>>>>> and code >>>>>>>>>> can be simplified so that make it easier to maintain. >>>>>>>>>> - *docker-pull-licenses=skip* skips all license pulling >>>>>>>>>> related tasks. This tag is provided for users who customize their >>>>>>>>>> images >>>>>>>>>> with DockerFile provided by us and would not like to deal with >>>>>>>>>> license >>>>>>>>>> stuff. Without this tag, the users should change the .gradle file >>>>>>>>>> to get >>>>>>>>>> rid of it. It also can be used for local development when >>>>>>>>>> developers want >>>>>>>>>> to solve license related issues later. >>>>>>>>>> >>>>>>>>>> I would like to see this tag is adopted by all docker images >>>>>>>>>> released by Beam, including Flink and Spark job server images, and >>>>>>>>>> share >>>>>>>>>> the same default mode. >>>>>>>>>> Does this sound good? Are there any concerns? >>>>>>>>>> >>>>>>>>>> Please let me know what you think. >>>>>>>>>> Hannah >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Apr 16, 2020 at 11:46 AM Hannah Jiang < >>>>>>>>>> hannahji...@google.com> wrote: >>>>>>>>>> >>>>>>>>>>> Ok, so the fianl approach is >>>>>>>>>>> 1. Using multi threading with 16 threads. >>>>>>>>>>> 2. Introduce docker-pull-license tag, by default it is disabled. >>>>>>>>>>> 3. By default, only check if urls return 200, not actually pull >>>>>>>>>>> the files to reduce image size. Licenses add 85MB of image size as >>>>>>>>>>> of today. >>>>>>>>>>> 4. Add retries to avoid flakiness. Initially it was added to >>>>>>>>>>> download part only, assuming urlib is stable when validates urls, >>>>>>>>>>> but it is >>>>>>>>>>> not true. >>>>>>>>>>> >>>>>>>>>>> For caching, it is definitely useful. It will mitigate flakiness >>>>>>>>>>> further, like when github is down (many urls point to github). It >>>>>>>>>>> also >>>>>>>>>>> reduces unnecessary internet traffic. >>>>>>>>>>> Let's reconsider it after the current work is merged to 2.21.0. >>>>>>>>>>> >>>>>>>>>>> Thank you all for providing feedback and ideas. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Apr 16, 2020 at 10:24 AM Kyle Weaver < >>>>>>>>>>> kcwea...@google.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> > I tried with multi processing and it improved the performance >>>>>>>>>>>> a lot. >>>>>>>>>>>> >>>>>>>>>>>> Great! Though it won't help with flakes, so as you said we >>>>>>>>>>>> should still look into caching as well. >>>>>>>>>>>> >>>>>>>>>>>> > You could try using a threadpool >>>>>>>>>>>> >>>>>>>>>>>> +1 >>>>>>>>>>>> >>>>>>>>>>>> > However, we would like to include this work as part of 2.21.0 >>>>>>>>>>>> >>>>>>>>>>>> I had marked the jira as a blocker for 2.21.0 because I was >>>>>>>>>>>> afraid something was broken, but now it looks like the failures >>>>>>>>>>>> were just >>>>>>>>>>>> flakes. So BEAM-9764 >>>>>>>>>>>> <https://issues.apache.org/jira/browse/BEAM-9764> should not >>>>>>>>>>>> be a release blocker. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Apr 16, 2020 at 1:00 PM Thomas Weise <t...@apache.org> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Hannah, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for investigating! >>>>>>>>>>>>> >>>>>>>>>>>>> I think it would be great to eliminate the overhead for local >>>>>>>>>>>>> builds (by default turn off the license assembly) and make it as >>>>>>>>>>>>> lightweight as possible in the frequent CI path. >>>>>>>>>>>>> >>>>>>>>>>>>> Thomas >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Apr 16, 2020 at 1:37 AM Hannah Jiang < >>>>>>>>>>>>> hannahji...@google.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I tried to check if urls are valid instead of pulling the >>>>>>>>>>>>>> files and it reduced only 1 min of running time. So, it's not an >>>>>>>>>>>>>> option >>>>>>>>>>>>>> here. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I tried with multi processing and it improved the performance >>>>>>>>>>>>>> a lot. >>>>>>>>>>>>>> With 12 subprocesses, the running time reduced to 49 seconds, >>>>>>>>>>>>>> and with 16 cores, it reduced to 18 seconds. >>>>>>>>>>>>>> The number of subprocesses is defined by the number of cores, >>>>>>>>>>>>>> and Jenkins machine has 16 cores. >>>>>>>>>>>>>> FYI: with my local machine (12 cores) and home network, it >>>>>>>>>>>>>> takes 1min 40 secs to create a Java docker image. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The caching approach mentioned by Robert brings many >>>>>>>>>>>>>> benefits, not only to this use case. >>>>>>>>>>>>>> However, we would like to include this work as part of >>>>>>>>>>>>>> 2.21.0, so I will move with the multi processing approach this >>>>>>>>>>>>>> time. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please let me know if you have objections. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 4:01 PM Robert Bradshaw < >>>>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Is the cost primarily in pulling these remote >>>>>>>>>>>>>>> licenses/sources? I'd guess that 99.9% of the URLs remain the >>>>>>>>>>>>>>> same from run >>>>>>>>>>>>>>> to run. Would a simple cache, or caching proxy, be sufficient? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Otherwise, a tag to check that licenses can be pulled, but >>>>>>>>>>>>>>> not really pull them, might be sufficient. (Making sure the >>>>>>>>>>>>>>> default is >>>>>>>>>>>>>>> cheap but we don't accidentally omit them when it matters is >>>>>>>>>>>>>>> the tricky bit >>>>>>>>>>>>>>> I see here.) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 3:38 PM Hannah Jiang < >>>>>>>>>>>>>>> hannahji...@google.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for providing feedback. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Here is what happending now and I would discuss when to run >>>>>>>>>>>>>>>> the job. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *Why it takes 7-8 mins for Java?* >>>>>>>>>>>>>>>> When we list dependencies both in runtime and compile >>>>>>>>>>>>>>>> environment, there are almost 1400 third party dependencies >>>>>>>>>>>>>>>> and we need to >>>>>>>>>>>>>>>> pull licenses/notices for all of them. >>>>>>>>>>>>>>>> In addition, we need to pull source code if license is >>>>>>>>>>>>>>>> CDDL, MPL, GLP or LGPL. 69 of the dependencies need to pull >>>>>>>>>>>>>>>> the source code >>>>>>>>>>>>>>>> as of 4/14/2020. >>>>>>>>>>>>>>>> Getting dependency list + pulling licenses/notices/source >>>>>>>>>>>>>>>> code takes 7-8 minutes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Now I see there are *two patterns of failures*. >>>>>>>>>>>>>>>> 1. In valid URLs. In fact, the urls are not invalid, but >>>>>>>>>>>>>>>> occationally, it returns URLError. This can be resolved by >>>>>>>>>>>>>>>> adding retries. >>>>>>>>>>>>>>>> However, it will add runtime to the job. >>>>>>>>>>>>>>>> 2. No artifacts available. Sometimes, when a new version of >>>>>>>>>>>>>>>> package is released and the plugin still looks for staging >>>>>>>>>>>>>>>> location. For >>>>>>>>>>>>>>>> example, new zetasql packages were released on 4/14, and today >>>>>>>>>>>>>>>> I saw >>>>>>>>>>>>>>>> several failures with looking for staging repo. The behavior >>>>>>>>>>>>>>>> is not >>>>>>>>>>>>>>>> consistent, sometimes it scans correct location, sometimes >>>>>>>>>>>>>>>> not. This can be >>>>>>>>>>>>>>>> resolved by running the job again. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *When the job is running?* >>>>>>>>>>>>>>>> generateThirdPartyLicenses is added to :sdks:java:container >>>>>>>>>>>>>>>> and it is an upstream of the docker task. As such, whenever a >>>>>>>>>>>>>>>> docker is >>>>>>>>>>>>>>>> created, the job is triggered. >>>>>>>>>>>>>>>> :sdks:java:container:docker is added to Java PreSubmit job. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> *How to improve it?* >>>>>>>>>>>>>>>> According to some ideas provided above, how about doing >>>>>>>>>>>>>>>> this? >>>>>>>>>>>>>>>> Introduce a tag (ie: pull-licenses) to docker job to decide >>>>>>>>>>>>>>>> if pull the files. Default tag is NOT setting pull-licenses. >>>>>>>>>>>>>>>> When pull-licenses is not set, it checks if the >>>>>>>>>>>>>>>> licenses/notices/source code can be pull automaticall or they >>>>>>>>>>>>>>>> have urls to >>>>>>>>>>>>>>>> pull from, but don't really pull. >>>>>>>>>>>>>>>> When pull-license is set, files are pulled. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For each PR (Presubmit): applying default option. The test >>>>>>>>>>>>>>>> would fail if the files cannot be pulled, so committers still >>>>>>>>>>>>>>>> need to fix >>>>>>>>>>>>>>>> dependency errors. I believe it would reduce the running time. >>>>>>>>>>>>>>>> For release: set the tag and pull the files and source >>>>>>>>>>>>>>>> code. Since it is checked for each PR, pulling should finish >>>>>>>>>>>>>>>> without >>>>>>>>>>>>>>>> problems. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let me know what you think and if there are other >>>>>>>>>>>>>>>> things can be improved. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hannah >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver < >>>>>>>>>>>>>>>> kcwea...@google.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Looks like the same error as this Jira: >>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-9764 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Even if/when we are able to fix this particular issue, I >>>>>>>>>>>>>>>>> agree it is best not to run this job except for releases >>>>>>>>>>>>>>>>> because of the >>>>>>>>>>>>>>>>> inherent network cost and possible reliability issues. +Hannah >>>>>>>>>>>>>>>>> Jiang <hannahji...@google.com> What do you think? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise < >>>>>>>>>>>>>>>>> t...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The new feature to assemble licenses is very useful but >>>>>>>>>>>>>>>>>> appears to add several minutes (7-8?) build time to jobs >>>>>>>>>>>>>>>>>> that need to >>>>>>>>>>>>>>>>>> build a container. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Does it also seem to cause occasional build failures? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/ >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Would it be possible to perform this task only during >>>>>>>>>>>>>>>>>> release builds? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Thomas >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>