Thanks for providing feedback.

Here is what is happening now, and I would like to discuss when to run
the job.

*Why does it take 7-8 minutes for Java?*
When we list dependencies in both the runtime and compile environments,
there are almost 1400 third-party dependencies, and we need to pull
licenses/notices for all of them.
In addition, we need to pull the source code if the license is CDDL, MPL,
GPL or LGPL. As of 4/14/2020, 69 of the dependencies require pulling the
source code.
Getting the dependency list and pulling licenses/notices/source code takes
7-8 minutes.
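
To make the source-code rule concrete, here is a minimal sketch of the
check, assuming each dependency reports its license as a short string such
as "CDDL 1.1" or "GPLv2". The function name and the matching rule are
hypothetical and are not taken from the actual job.

```python
import re

# License families that require pulling source code (from the list above).
# Matching on abbreviations at a word boundary is an assumption made for
# this sketch; real license names may need a more careful mapping.
_SOURCE_REQUIRED = re.compile(r"\b(?:CDDL|MPL|LGPL|GPL)", re.IGNORECASE)


def needs_source(license_name):
    """Return True if the license name starts a word with one of the families."""
    return _SOURCE_REQUIRED.search(license_name) is not None
```

Anchoring at a word boundary avoids false positives such as "Simplified
BSD", which contains the letters "mpl" in the middle of a word.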

Now I see there are *two patterns of failures*.
1. Invalid URLs. In fact, the URLs are not invalid, but occasionally a
request returns URLError. This can be resolved by adding retries. However,
retries will add runtime to the job.
2. No artifacts available. Sometimes, when a new version of a package is
released, the plugin still looks in the staging location. For example, new
zetasql packages were released on 4/14, and today I saw several failures
from looking in the staging repo. The behavior is not consistent: sometimes
it scans the correct location, sometimes not. This can be resolved by
running the job again.
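
For the first pattern, a retry could look roughly like the sketch below. It
assumes the files are fetched with Python's urllib; the function name, the
number of attempts and the delays are hypothetical, not what the job does
today.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(url, opener=urllib.request.urlopen, attempts=3,
                       base_delay=1.0, sleep=time.sleep):
    """Fetch url, retrying on URLError with exponential backoff.

    opener and sleep are injectable only so the sketch is easy to test.
    """
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.URLError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Wait 1s, 2s, 4s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With these defaults, a URL that keeps failing adds about 3 seconds of
waiting before the error is raised, which is the runtime cost mentioned in
point 1.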

*When does the job run?*
generateThirdPartyLicenses is added to :sdks:java:container and is
upstream of the docker task. As such, whenever a Docker image is built, the
job is triggered.
:sdks:java:container:docker is part of the Java PreSubmit job.

*How to improve it?*
Based on the ideas provided above, how about doing this?
Introduce a tag (e.g. pull-licenses) on the docker job to decide whether to
pull the files. By default, pull-licenses is NOT set.
When pull-licenses is not set, the job checks whether the
licenses/notices/source code can be pulled automatically or have URLs to
pull from, but doesn't actually pull them.
When pull-licenses is set, the files are pulled.
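
A rough sketch of the two modes, only to illustrate the idea. The
Dependency type, the process_licenses function and the fetch callback are
hypothetical names, and "can be pulled" is simplified here to "a URL is
known".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Dependency:
    name: str
    license_url: Optional[str]  # None when no URL is known for this dependency


def _no_download(url: str) -> None:
    """Placeholder downloader; a real job would fetch and store the file."""


def process_licenses(deps: List[Dependency], pull_licenses: bool = False,
                     fetch: Callable[[str], object] = _no_download) -> List[str]:
    """Return the names of dependencies whose license cannot be resolved.

    The check runs in both modes; files are downloaded only when
    pull_licenses is set.
    """
    unresolved = []
    for dep in deps:
        if dep.license_url is None:
            unresolved.append(dep.name)
            continue
        if pull_licenses:
            fetch(dep.license_url)
    return unresolved
```

A PreSubmit run would call process_licenses(deps) and fail if the returned
list is not empty. A release run would call it with pull_licenses=True and
a real downloader.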

For each PR (PreSubmit): apply the default option. The test would fail if
the files cannot be pulled, so committers still need to fix dependency
errors. I believe it would reduce the running time.
For releases: set the tag and pull the files and source code. Since this is
checked for each PR, pulling should finish without problems.

Please let me know what you think and whether there are other things that
can be improved.

Hannah



On Wed, Apr 15, 2020 at 2:30 PM Kyle Weaver <kcwea...@google.com> wrote:

> Looks like the same error as this Jira:
> https://issues.apache.org/jira/browse/BEAM-9764
>
> Even if/when we are able to fix this particular issue, I agree it is best
> not to run this job except for releases because of the inherent network
> cost and possible reliability issues. +Hannah Jiang
> <hannahji...@google.com> What do you think?
>
> On Wed, Apr 15, 2020 at 5:20 PM Thomas Weise <t...@apache.org> wrote:
>
>> The new feature to assemble licenses is very useful but appears to add
>> several minutes (7-8?)  build time to jobs that need to build a container.
>>
>> Does it also seem to cause occasional build failures?
>>
>> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/131/
>>
>> Would it be possible to perform this task only during release builds?
>>
>> Thanks,
>> Thomas
>>
>>
