Hey all,

I've done a few checks to pinpoint the issue and it seems that I've just
fixed it.

I didn't know this before, but the Flink, Spark and Direct Nexmark tests
run on a special Jenkins worker. `apache-beam-jenkins-16` is labeled with
`beam-perf`, so only these tests can execute there. I can't be sure,
because the configuration of the old CI is already gone, but I guess that
this worker was configured with only one executor (which I had missed).
That would prevent concurrent execution of jobs on the node and stabilize
the timings.

That's how I've now configured the node, and it seems that the timings are
back to the pre-migration values:
http://104.154.241.245/d/ahuaA_zGz/nexmark?orgId=1&from=now-90d&to=now

Dataflow was not affected because it wasn't restricted to run on
`apache-beam-jenkins-16`.
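
For anyone who wants to double-check the executor count on the perf worker,
here is a minimal sketch. It assumes a JSON payload shaped like the one
Jenkins returns from `/computer/api/json` (a `computer` list whose entries
carry `displayName`, `numExecutors` and `assignedLabels`); the sample data
and the helper name `executors_for_label` are made up for illustration, and
the sketch does no network access.

```python
def executors_for_label(payload, label):
    """Map node name -> executor count for every node carrying `label`.

    `payload` is assumed to look like the Jenkins /computer/api/json
    response; the field names here are an assumption, not verified
    against the Beam CI instance.
    """
    result = {}
    for node in payload.get("computer", []):
        labels = {entry.get("name") for entry in node.get("assignedLabels", [])}
        if label in labels:
            result[node["displayName"]] = node.get("numExecutors")
    return result


# Hypothetical sample payload, not real data from ci-beam.apache.org.
SAMPLE = {
    "computer": [
        {
            "displayName": "apache-beam-jenkins-15",
            "numExecutors": 2,
            "assignedLabels": [{"name": "beam"}],
        },
        {
            "displayName": "apache-beam-jenkins-16",
            "numExecutors": 1,
            "assignedLabels": [{"name": "beam-perf"}],
        },
    ]
}

perf_nodes = executors_for_label(SAMPLE, "beam-perf")
# Any perf node with more than one executor could run jobs concurrently.
concurrent_capable = [name for name, count in perf_nodes.items() if count != 1]
```

An empty `concurrent_capable` list would mean every `beam-perf` node is
limited to one job at a time, which is the setup described above.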

Regards,
Damian


On Wed, Jul 22, 2020 at 5:11 PM Kenneth Knowles <k...@apache.org> wrote:

> Are Spark and Flink runners benchmarking against local clusters on the
> Jenkins VMs? Needless to say that is not a very controlled environment (and
> of course not realistic scale). That is probably why Dataflow was not
> affected. Is it possible that simply the different version of the Jenkins
> worker software and/or the instructions from the Cloudbees instance cause
> differing load?
>
> Kenn
>
> On Tue, Jul 21, 2020 at 4:17 PM Valentyn Tymofieiev <valen...@google.com>
> wrote:
>
>> FYI it looks like the transition to new Jenkins CI is visible on Nexmark
>> performance graphs[1][2]. Are new VM nodes less performant than old ones?
>>
>> [1] http://104.154.241.245/d/ahuaA_zGz/nexmark?orgId=1&from=1587597387737&to=1595373387737&var-processingType=batch&var-ID=All&var-runner=All
>> [2]
>> https://issues.apache.org/jira/browse/BEAM-10542?focusedCommentId=17162374&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17162374
>>
>> On Thu, Jun 18, 2020 at 3:32 PM Tyson Hamilton <tyso...@google.com>
>> wrote:
>>
>>> Currently no. We're already experiencing a backlog of builds so the
>>> additional load would be a problem. I've opened two related issues that I
>>> think need completion before allowing non-committers to trigger tests:
>>>
>>> Load sharing improvements:
>>> https://issues.apache.org/jira/browse/BEAM-10281
>>> Admin access (maybe not required but nice to have):
>>> https://issues.apache.org/jira/browse/BEAM-10280
>>>
>>> I created https://issues.apache.org/jira/browse/BEAM-10282 to track
>>> opening up triggering for non-committers.
>>>
>>> On Thu, Jun 18, 2020 at 3:30 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> Was about to ask the same question, so can non-committers trigger the
>>>> tests now?
>>>>
>>>> On Thu, Jun 18, 2020 at 11:54 AM Heejong Lee <heej...@google.com>
>>>> wrote:
>>>>
>>>>> This is awesome. Could non-committers also trigger the test now?
>>>>>
>>>>> On Wed, Jun 17, 2020 at 6:12 AM Damian Gadomski <
>>>>> damian.gadom...@polidea.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Good news, we've just migrated to the new CI:
>>>>>> https://ci-beam.apache.org. As of now, Beam projects at
>>>>>> builds.apache.org are disabled.
>>>>>>
>>>>>> If you experience any issues with the new setup please let me know,
>>>>>> either here or on ASF slack.
>>>>>>
>>>>>> Regards,
>>>>>> Damian
>>>>>>
>>>>>> On Mon, Jun 15, 2020 at 10:40 PM Damian Gadomski <
>>>>>> damian.gadom...@polidea.com> wrote:
>>>>>>
>>>>>>> Happy to see your positive response :)
>>>>>>>
>>>>>>> @Udi Meiri, Thanks for pointing that out. I've checked it and indeed
>>>>>>> it needs some attention.
>>>>>>>
>>>>>>> There are two things, based on my research:
>>>>>>>
>>>>>>>    - data uploaded by the performance and load test jobs directly
>>>>>>>    to InfluxDB - that should be handled automatically, as new
>>>>>>>    jobs will upload the same data in the same way
>>>>>>>    - data fetched using the Jenkins API by the metrics tool
>>>>>>>    (syncjenkins.py) - here the situation is a bit more complex, as
>>>>>>>    the script relies on the build number (it's actually used as a
>>>>>>>    time reference, and the primary key in the DB is created from
>>>>>>>    it). To avoid refactoring the script and migrating the database
>>>>>>>    to use a timestamp instead of the build number, I've just
>>>>>>>    "fast-forwarded" the numbers on the new
>>>>>>>    https://ci-beam.apache.org to follow the current numbering from
>>>>>>>    the old CI. Therefore a simple replacement of the Jenkins URL in
>>>>>>>    the metrics scripts should do the trick to have continuous
>>>>>>>    metrics data. I'll check that tomorrow on my local grafana
>>>>>>>    instance.
>>>>>>>
>>>>>>> Please let me know if there's anything that I missed.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Damian
>>>>>>>
>>>>>>> On Mon, Jun 15, 2020 at 8:05 PM Alexey Romanenko <
>>>>>>> aromanenko....@gmail.com> wrote:
>>>>>>>
>>>>>>>> Great! Thank you for working on this and letting us know.
>>>>>>>>
>>>>>>>> On 12 Jun 2020, at 16:58, Damian Gadomski <
>>>>>>>> damian.gadom...@polidea.com> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> During the last few days, I was preparing for the Beam Jenkins
>>>>>>>> migration from builds.apache.org to ci-beam.apache.org. The new
>>>>>>>> Jenkins Master will be dedicated only for Beam related jobs, all Beam
>>>>>>>> Committers will have build configure access, and Beam PMC will have
>>>>>>>> Admin (GUI) Access.
>>>>>>>>
>>>>>>>> We (in cooperation with Infra) are almost ready for the migration
>>>>>>>> itself and I want to share with you the details of our plan. We are
>>>>>>>> planning to start the migration next week, most likely on Tuesday. I'll
>>>>>>>> keep you updated on the progress. We do not expect any issues or
>>>>>>>> outage of the CI services; everything should be more or less
>>>>>>>> unnoticeable.
>>>>>>>> Just don't be surprised that the Jenkins URL will change to
>>>>>>>> https://ci-beam.apache.org
>>>>>>>>
>>>>>>>> If you are curious, here are the steps that we are going to take:
>>>>>>>>
>>>>>>>> 1. Create 16 new CI nodes that will be connected to the new CI. We
>>>>>>>> will then have two CI servers running simultaneously.
>>>>>>>> 2. Verify that new builds work as expected on the new instance
>>>>>>>> (compare results of cron builds). (a day or two would be sufficient)
>>>>>>>> 3. Move the responsibility of Phrase/PR/Commit builds to the new
>>>>>>>> CI, disable on the old one.
>>>>>>>> 4. Modify the .test-infra/jenkins/README.md to point to the new
>>>>>>>> instance and replace Post-commit tests status in README.md and
>>>>>>>> .github/PULL_REQUEST_TEMPLATE.md
>>>>>>>> 5. Disable the jobs on the old Jenkins and add a description to
>>>>>>>> each job with the URL to the corresponding one on the new CI.
>>>>>>>> 6. Turn off VM instances of the old nodes.
>>>>>>>> 7. Remove VM instances of the old nodes.
>>>>>>>>
>>>>>>>> In case of any questions or doubts feel free to ask :)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Damian
>>>>>>>>
>>>>>>>>
>>>>>>>>
