Re: Jenkins – running ovirt-vmconsole builds

2018-11-05 Thread Barak Korren
On Mon, 5 Nov 2018 at 14:21, Tomasz Barański  wrote:

> Hello,
>
> I'm trying to diagnose a failing build[1] in ovirt-vmconsole. It builds
> fine in
> a Fedora VM on my laptop, but fails on the Jenkins server.
>
> Is there a way to trigger the build with changes from a specific Gerrit
> changeset? Or maybe I can get permissions to run the build manually?
>
>
You can add 'ci build please' as a comment in Gerrit to run the build job
on the latest patchset for the patch.

If this is something you'd do often, I'd recommend adding the build process
to check-patch.sh.
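
For example, a minimal automation/check-patch.sh could look roughly like this
(a sketch only - the autogen.sh/make steps are an assumption about how
ovirt-vmconsole builds, so substitute whatever you run on your laptop):

    #!/bin/bash -xe
    # automation/check-patch.sh - minimal sketch, assuming an autotools build
    ./autogen.sh            # assumption: the project ships an autogen.sh
    make                    # build exactly what you build locally
    make check              # assumption: run the test suite, if there is one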

You can also easily emulate how the CI runs things locally using
mock_runner.sh [1]

[1]: https://ovirt-infra-docs.readthedocs.io/en/latest/CI/Using_mock_runner/
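
For example, with the 'jenkins' repo cloned next to your project checkout,
something along these lines (the flags and chroot pattern here are a sketch
from memory - the doc above has the authoritative invocation):

    # clone the CI scripts once
    git clone https://gerrit.ovirt.org/jenkins
    cd ovirt-vmconsole
    # run the check-patch stage inside a mock chroot matching el7
    ../jenkins/mock_configs/mock_runner.sh \
        --mock-confs-dir ../jenkins/mock_configs \
        --patch-only 'el7.*x86_64'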


>
> Tomo
>
> [1] https://gerrit.ovirt.org/#/c/95175/
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/HRNZRDGJMOPJB7S4KA2LVNA6OHQ4TFM4/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/G46C3CQNY2LYQZ7B7JIX2L7OMRDYRSGF/


Re: [ovirt-devel] [CQ ovirt master] [ovirt-engine] - not passing for 10 days

2018-11-15 Thread Barak Korren
py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log/*view*/
>> >> > > > >>>>
>> >> > > > >>>> Is this helpful for you?
>> >> > > > >>>>
>> >> > > > >>>>
>> >> > > > >>>>
>> >> > > > >>>> actually, there are two issues
>> >> > > > >>>> 1) cluster is still 4.3 even after Martin’s revert.
>> >> > > > >>>>
>> >> > > > >>>
>> >> > > > >>> https://gerrit.ovirt.org/#/c/95409/ should align cluster
>> level with dc level
>> >> > > > >>>
>> >> > > > >>
>> >> > > > >> This change aligns the cluster level, but
>> >> > > > >>
>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3502/parameters/
>> >> > > > >> consuming build result from
>> >> > > > >>
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/11121/
>> >> > > > >> looks like that this does not solve the issue:
>> >> > > > >>  File
>> "/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>> line 698, in run_vms
>> >> > > > >>api.vms.get(VM0_NAME).start(start_params)
>> >> > > > >>  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/brokers.py", line
>> 31193, in start
>> >> > > > >>headers={"Correlation-Id":correlation_id}
>> >> > > > >>  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line
>> 122, in request
>> >> > > > >>persistent_auth=self.__persistent_auth
>> >> > > > >>  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
>> line 79, in do_request
>> >> > > > >>persistent_auth)
>> >> > > > >>  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
>> line 162, in __do_request
>> >> > > > >>raise errors.RequestError(response_code, response_reason,
>> response_body)
>> >> > > > >> RequestError:
>> >> > > > >> status: 400
>> >> > > > >> reason: Bad Request
>> >> > > > >>
>> >> > > > >> engine.log:
>> >> > > > >> 2018-11-14 03:10:36,802-05 INFO
>> [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-3)
>> [99e282ea-577a-4dab-857b-285b1df5e6f6] Candidate host
>> 'lago-basic-suite-master-host-0' ('4dbfb937-ac4b-4cef-8ae3-124944829add')
>> was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'CPU-Level'
>> (correlation id: 99e282ea-577a-4dab-857b-285b1df5e6f6)
>> >> > > > >> 2018-11-14 03:10:36,802-05 INFO
>> [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-3)
>> [99e282ea-577a-4dab-857b-285b1df5e6f6] Candidate host
>> 'lago-basic-suite-master-host-1' ('731e5055-706e-4310-a062-045e32ffbfeb')
>> was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'CPU-Level'
>> (correlation id: 99e282ea-577a-4dab-857b-285b1df5e6f6)
>> >> > > > >> 2018-11-14 03:10:36,802-05 ERROR
>> [org.ovirt.engine.core.bll.RunVmCommand] (default task-3)
>> [99e282ea-577a-4dab-857b-285b1df5e6f6] Can't find VDS to run the VM
>> 'dc1e1e92-1e5c-415e-8ac2-b919017adf40' on, so this VM will not be run.
>> >> > > > >>
>> >> > > > >>
>> >> > > > >
>> >> > > > >
>> >> > > > > https://gerrit.ovirt.org/#/c/95283/ results in
>> >> > > > >
>> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-el7-x86_64/8071/
>> >> > > > > which is used in
>> >> > > > >
>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3504/parameters/
>> >> > > > > results in run_vms succeeding.
>> >> > > > >
>> >> > > > > The next merged change
>> >> > > > > https://gerrit.ovirt.org/#/c/95310/ results in
>> >> > > > >
>> http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-el7-x86_64/8072/
>> >> > > > > which is used in
>> >> > > > >
>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/3505/parameters/
>> >> > > > > results in run_vms failing with
>> >> > > > >  File
>> "/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>> line 698, in run_vms
>> >> > > > >api.vms.get(VM0_NAME).start(start_params)
>> >> > > > >  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/brokers.py", line
>> 31193, in start
>> >> > > > >headers={"Correlation-Id":correlation_id}
>> >> > > > >  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line
>> 122, in request
>> >> > > > >persistent_auth=self.__persistent_auth
>> >> > > > >  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
>> line 79, in do_request
>> >> > > > >persistent_auth)
>> >> > > > >  File
>> "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/connectionspool.py",
>> line 162, in __do_request
>> >> > > > >raise errors.RequestError(response_code, response_reason,
>> response_body)
>> >> > > > > RequestError:
>> >> > > > > status: 400
>> >> > > > > reason: Bad Request
>> >> > > > >
>> >> > > > >
>> >> > > > > So even if the Cluster Level should be 4.2 now,
>> >> > > > > still https://gerrit.ovirt.org/#/c/95310/ seems to influence the
>> behavior.
>> >> > > >
>> >> > > > I really do not see how it can affect 4.2.
>> >> > >
>> >> > > Me neither.
>> >> > >
>> >> > > > Are you sure the cluster is really 4.2? Sadly it’s not being
>> logged at all
>> >> > >
>> >> > > screenshot from local execution https://imgur.com/a/yiWBw3c
>> >> > >
>> >> > > > But if it really seems to matter (and since it needs a fix anyway
>> for 4.3) feel free to revert it of course
>> >> > > >
>> >> > >
>> >> > > I will post a revert change and check if this changes the behavior.
>> >> >
>> >> > Dominik, thanks for the research and for Martin's and your
>> >> > reverts/fixes. Finally Engine passes OST
>> >> >
>> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11153/
>> >> > and QE can expect a build tomorrow, after 2 weeks of droughts.
>> >>
>> >> unfortunately, the drought continues.
>> >
>> >
>> > Sorry, missing the context or meaning, what does drought mean?
>>
>> Pardon my flowery language. I mean 2 weeks of no ovirt-engine builds.
>>
>> >
>> >>
>> >> Barak tells me that something is broken in the nightly cron job
>> >> copying the tested repo onto the master-snapshot one.
>> >
>> >
>> > Dafna, can you check this?
>> >
>> >>
>> >>
>> >> +Edri: please make it a priority to have it fixed.
>> >
>> >
>> >
>> > --
>> >
>> > Eyal edri
>> >
>> >
>> > MANAGER
>> >
>> > RHV/CNV DevOps
>> >
>> > EMEA VIRTUALIZATION R&D
>> >
>> >
>> > Red Hat EMEA
>> >
>> > TRIED. TESTED. TRUSTED.
>> > phone: +972-9-7692018
>> > irc: eedri (on #tlv #rhev-dev #rhev-integ)
>>
>

-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/MMWXTDQD6BBRPCZEY3BC4LYHRVKNXYGZ/


Re: jenkins is dead

2018-11-21 Thread Barak Korren
 for
> jobs with matching label expression; ‘vm0159.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0160.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0161.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression; ‘vm0162.workers-phx.ovirt.org’ is reserved for
> jobs with matching label expression; ‘vm0163.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0164.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0165.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression; ‘vm0200.workers-phx.ovirt.org’ is reserved for
> jobs with matching label expression; ‘vm0201.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0203.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0204.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression; ‘vm0205.workers-phx.ovirt.org’ is reserved for
> jobs with matching label expression; ‘vm0206.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0207.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0208.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression; ‘vm0215.workers-phx.ovirt.org’ is reserved for
> jobs with matching label expression; ‘vm0216.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0217.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0218.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression; ‘vm0219.workers-phx.ovirt.org’ is reserved for
> jobs with matching label expression; ‘vm0220.workers-phx.ovirt.org’ is
> reserved for jobs with matching label expression; ‘
> vm0221.workers-phx.ovirt.org’ is reserved for jobs with matching label
> expression; ‘vm0222.workers-phx.ovirt.org’ is reserved for jobs with
> matching label expression)
>
>
> Looks like we have a lot of Jenkins workers but we can't use them.
>
>
>
>>
>> Anton.
>>
>>
>> On 21 November 2018 at 10:06:25, Eyal Edri (ee...@redhat.com) wrote:
>> > Thanks for reporting, we got reports on it on IRC as well,
>> > We're looking into it.
>> >
>> > I do see some jobs running, but there seems to be slowness as well,
>> we'll
>> > provide feedback once we have more info.
>> >
>> > On Wed, Nov 21, 2018 at 11:00 AM Michal Skrivanek <
>> > michal.skriva...@redhat.com> wrote:
>> >
>> > > About 20 minutes ago jenkins stopped running any check-patch runs
>> > > Can anyone fix it?
>> > >
>> > > Thanks,
>> > > michal
>> > > ___
>> > > Infra mailing list -- infra@ovirt.org
>> > > To unsubscribe send an email to infra-le...@ovirt.org
>> > > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> > > oVirt Code of Conduct:
>> > > https://www.ovirt.org/community/about/community-guidelines/
>> > > List Archives:
>> > >
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/GQXOQBL47K3I7E2G36YUD4DOHXMVRXJV/
>> > >
>> >
>> >
>> > --
>> >
>> > Eyal edri
>> >
>> >
>> > MANAGER
>> >
>> > RHV/CNV DevOps
>> >
>> > EMEA VIRTUALIZATION R&D
>> >
>> >
>> > Red Hat EMEA
>> > TRIED. TESTED. TRUSTED.
>> > phone: +972-9-7692018
>> > irc: eedri (on #tlv #rhev-dev #rhev-integ)
>> > ___
>> > Infra mailing list -- infra@ovirt.org
>> > To unsubscribe send an email to infra-le...@ovirt.org
>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> > oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> > List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/FYD2ONOL5OFQMRH7H6HUK567D3UGK74V/
>> >
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/3IQJKNIIE2BEGC7S3DFWHKKHB6FBJIHY/
>>
>
>
> --
>
> SANDRO BONAZZOLA
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA <https://www.redhat.com/>
>
> sbona...@redhat.com
> <https://red.ht/sig>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/NYQ3EFOGVEHVGT47ZT6NIVRFQPP6PYKF/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/AC33S3ER2VPGGAEUJCOS45MADEYRF3O6/


Re: [CQ]: 95559,13 (vdsm) failed "ovirt-master" system tests

2018-11-28 Thread Barak Korren
בתאריך יום ד׳, 28 בנוב׳ 2018, 23:12, מאת Dafna Ron :

>
>
> On Wed, Nov 28, 2018 at 9:01 PM Nir Soffer  wrote:
>
>> On Wed, Nov 28, 2018 at 10:47 PM Nir Soffer  wrote:
>>
>>> On Wed, Nov 28, 2018 at 10:32 PM Dafna Ron  wrote:
>>>
 1. it did not break OST. One failed CQ run for one project does not
 mean that OST is broken :)

>>>
>>> Ok, broke the change queue :-)
>>>
>>>
 2. the build was reported as failed even though there was no actual
 failure, as CQ did not even start to run. If you look at the error, you can
 see that CQ actually exited because the package failed to build. So
 no package -> no CQ run.

 *18:25:48* vdsm_standard-on-merge (33) failed building

 The vdsm build is failing with this error:

 *20:04:31* [build-artifacts.fc28.s390x] *** WARNING: mangling shebang in 
 /usr/libexec/vdsm/hooks/before_nic_hotplug/50_macspoof from 
 #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR, fix it 
 manually!*20:04:31* [build-artifacts.fc28.s390x] *** WARNING: mangling 
 shebang in /usr/libexec/vdsm/hooks/before_device_create/50_macspoof from 
 #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR, fix it 
 manually!*20:04:32* [build-artifacts.fc28.s390x] *** WARNING: mangling 
 shebang in /usr/libexec/vdsm/hooks/before_vm_start/50_fileinject from 
 #!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR, fix it 
 manually!

 This looks like a warning, not an error.
>>>
>>> Maybe the stdci change added fedora 28 build on s390x, that was not
>>> enabled before the patch?
>>>
>>
>> Checking gerrit, we see:
>>
>> http://jenkins.ovirt.org/job/vdsm_standard-on-merge/33/ : FAILURE
>>
>
> This is what CQ is waiting on, so as long as it fails, CQ will not run.
>
>
>> http://jenkins.ovirt.org/job/vdsm_master_check-merged-el7-x86_64/3939/ :
>> FAILURE
>> check merged is failing most of the time, we can safely ignore it.
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-fc28-x86_64/343/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/4507/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-ppc64le/3925/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-fc28-s390x/208/
>> : SUCCESS
>> So build artifacts succeeded with all platform/distros.
>>
>> http://jenkins.ovirt.org/job/standard-enqueue/17729/ : This change was
>> successfully submitted to the change queue(s) for system testing.
>> And the patch was added to the change queue.
>> Based on this I expect OST to use the packages built in gerrit.
>>
> As I said, and as you can see in the logs, build-artifacts is not what
> CQ is waiting for in stdci v2, so it will not run as long as standard-on-merge
> fails.
> looking again using blue ocean we can see the cause of failure is
> check-merged:
> https://jenkins.ovirt.org/blue/organizations/jenkins/vdsm_standard-on-merge/detail/vdsm_standard-on-merge/35/pipeline/132
> As long as it fails, the build will fail and CQ will not be able to run.
>
>>
>> We need more info about this failure.
>>
>
Please note that vdsm currently has both v1 and v2 jobs, so changes are
being submitted to the CQ twice. It's perfectly possible that the change
made it through the CQ via the V1 jobs.

If check-merged is an issue for now, you can just exclude it from running
in stdci.yaml.
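
A minimal sketch of what that could look like, assuming the V2 stage-list
syntax (you list only the stages you do want, so check-merged is simply left
out):

    # stdci.yaml - illustrative only
    stages:
      - check-patch
      - build-artifacts   # check-merged intentionally not listed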



>> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/ZPAZ7LNENB4B46GZYBPMD4WZQAB5KRVR/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/QJ5L7EK2QG4UM3SDLD6VL2OE5JLRHG2S/


Re: S390x machine problem

2018-11-30 Thread Barak Korren
Probably the mock cache again...

We can't clean it with our scripts because we don't have sudo there.
Instead we need to do it via mock by bind-mounting the cache directly into
the chroot.


בתאריך יום ו׳, 30 בנוב׳ 2018, 10:45, מאת Ehud Yonasi :

> Seems like space issue.
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/AXMPS3IXPTQFNP3WZPBMOMYKBQMMTXZG/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/O5U2DAE7IMLBJK3RGDBC7KYCTCOC67TM/


Space issues on lfedora1.lf-dev.marist.edu

2018-12-01 Thread Barak Korren
Hi Dan,

How are you?

As you know we've been using `lfedora1.lf-dev.marist.edu` to generate s390x
builds of oVirt.

We've recently seen some failures that have to do with running out of space
on the node. Some of this seems to be our fault, as clearing up stale mock
chroots we created freed up about 14G, but after doing that I still see
there are 48G used there (Is the OS image that big?).
Can some more space be cleared up on the node? Could we perhaps have the
disk space increased there?

I was also wondering, could we set up and use Docker on that node? We've
been switching to using containers on our regular CI nodes, and it'd be a
shame to leave s390x behind...

Thanks,
Barak.

-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/CVJPFB77VC5SVABBHOWSOP3ILE6MMEWT/


Re: [CQ]: 252c21d (ovirt-ansible-engine-setup) failed "ovirt-master" system tests

2018-12-03 Thread Barak Korren
On Mon, 3 Dec 2018 at 13:35, Dafna Ron  wrote:

> CQ did not run because of a failed build-artifacts:
>
>
> https://jenkins.ovirt.org/job/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/44/consoleFull
>
> It seems the issue may be a configuration problem:
>
> *09:30:49* [build-artifacts.el7.x86_64] ‘build-artifacts.el7.x86_64/**’ 
> doesn’t match anything, but ‘**’ does. Perhaps that’s what you 
> mean?*09:30:49* [build-artifacts.el7.x86_64] No artifacts found that match 
> the file pattern "build-artifacts.el7.x86_64/**". Configuration error?
>
>
That is not the issue, this is:

*09:30:39* [build-artifacts.el7.x86_64] sh:
/home/jenkins/workspace/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/ovirt-ansible-engine-setup@tmp/durable-80ef2dc7/jenkins-result.txt.tmp:
No such file or directory*09:30:39* [build-artifacts.el7.x86_64] mv:
cannot stat 
'/home/jenkins/workspace/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/ovirt-ansible-engine-setup@tmp/durable-80ef2dc7/jenkins-result.txt.tmp':
No such file or directory


As if the runtime files were somehow deleted from the slave while the job
is running.

Is anyone doing manual maintenance on nodes right now?

@dafna: did you re-trigger the job already?



>
> On Fri, Nov 30, 2018 at 8:42 PM oVirt Jenkins  wrote:
>
>> Change 252c21d (ovirt-ansible-engine-setup) is probably the reason behind
>> recent system test failures in the "ovirt-master" change queue and needs
>> to be
>> fixed.
>>
>> This change had been removed from the testing queue. Artifacts build from
>> this
>> change will not be released until it is fixed.
>>
>> For further details about the change see:
>>
>> https://github.com/oVirt/ovirt-ansible-engine-setup/commit/252c21d708e0e9a15a0793f4d0ef2c38870c76e6
>>
>> For failed test results see:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11740/
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/OGIBMJ4ZOBTQ7QNWJK637KCX3VEK5QRY/
>>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/KCVMTRD6B7ADXA44IF6UH2QCLX5N47MZ/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/YDWP6WE3JAHCYZSQUUHQ346GPWTNO6J7/


Re: Space issues on lfedora1.lf-dev.marist.edu

2018-12-03 Thread Barak Korren
On Mon, 3 Dec 2018 at 10:37, Dan Horák  wrote:

> Hi Barak,
>
> On Sun, 2 Dec 2018 09:50:34 +0200
> Barak Korren  wrote:
>
> > Hi Dan,
> >
> > How are you.
> >
> > As you know we've been using `lfedora1.lf-dev.marist.edu` to generate
> > s390x build of oVirt.
> >
> > We've recently seen some failures that have to do with running out of
> > space on the node. Some of this seems to be our fault, as clearing up
> > stale mock chroots we created freed up about 14G, but after doing
> > that I still see there are 48G used there (Is the OS image that big?).
> > Can some more space be cleared up on the node? Could we perhaps have
> > the disk space increased there?
>
> thanks for the info, I'm looking into it. There are multiple users sharing
> the machine, so someone else might have used up all the free space :-) How
> easily could you migrate your setup to our second guest (same specs)?
> We could try the containers there.
>

I'd rather keep the current setup as it is, and have it keep working as we
try out the containers. We can remove it once the containers are working
well...

> I was also wondering, could we setup and use Docker on that node?
> > We've been switching to using containers on our regular CI nodes, and
> > it'd be a shame to leave s390x behind...
>
> I have been thinking about containers already as another level of
> interaction. I would prefer podman (and co) for the runtime, it's RH
> preferred technology, doesn't require a daemon and allows
> non-privileged use.
>

I'm all for using podman down the line, but there are a few reasons why we
need docker currently:

   1. All our existing code has been developed and tested on Docker; we
   will switch to podman eventually, but we're not going to be ready for that in
   the near future.
   2. The main thing we want to do is use the jenkins-docker plugin to spin
   up and remove the containers for us - AFAIK there is no plugin for
   podman ATM.

WRT non-privileged use - we're currently still running mock inside the
container, so we need it to be privileged...


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/BUJRB72OTOFEIY3SU6XEFOHKF4RIV4ZD/


Re: [CQ]: 252c21d (ovirt-ansible-engine-setup) failed "ovirt-master" system tests

2018-12-03 Thread Barak Korren
On Mon, 3 Dec 2018 at 14:16, Dafna Ron  wrote:

>
>
> On Mon, Dec 3, 2018 at 11:54 AM Barak Korren  wrote:
>
>>
>>
>> On Mon, 3 Dec 2018 at 13:35, Dafna Ron  wrote:
>>
>>> CQ did not run because of a failed build-artifacts:
>>>
>>>
>>> https://jenkins.ovirt.org/job/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/44/consoleFull
>>>
>>> It seems the issue may be a configuration problem:
>>>
>>> *09:30:49* [build-artifacts.el7.x86_64] ‘build-artifacts.el7.x86_64/**’ 
>>> doesn’t match anything, but ‘**’ does. Perhaps that’s what you 
>>> mean?*09:30:49* [build-artifacts.el7.x86_64] No artifacts found that match 
>>> the file pattern "build-artifacts.el7.x86_64/**". Configuration error?
>>>
>>>
>> That is not the issue, this is:
>>
>> *09:30:39* [build-artifacts.el7.x86_64] sh: 
>> /home/jenkins/workspace/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/ovirt-ansible-engine-setup@tmp/durable-80ef2dc7/jenkins-result.txt.tmp:
>>  No such file or directory*09:30:39* [build-artifacts.el7.x86_64] mv: cannot 
>> stat 
>> '/home/jenkins/workspace/oVirt_ovirt-ansible-engine-setup_standard-on-ghpush/ovirt-ansible-engine-setup@tmp/durable-80ef2dc7/jenkins-result.txt.tmp':
>>  No such file or directory
>>
>>
>> As if the runtime files were somehow deleted from the slave while the job
>> is running.
>>
>
> We had the same thing happen last week:
> https://ovirt-jira.atlassian.net/browse/OVIRT-2587
>
> Is anyone doing manual maintenance on nodes right now?
>
> Not me. Adding Evgheni, Daniel and Gal
>
> @dafna: did you re-trigger the job already?
>
> Not yet. is it ok to do it now or would you like me to wait?
>

no, always rerun on infra issues ASAP.



>
>
>
>>
>> On Fri, Nov 30, 2018 at 8:42 PM oVirt Jenkins  wrote:
>>
>>> Change 252c21d (ovirt-ansible-engine-setup) is probably the reason behind
>>> recent system test failures in the "ovirt-master" change queue and needs
>>> to be
>>> fixed.
>>>
>>> This change had been removed from the testing queue. Artifacts build
>>> from this
>>> change will not be released until it is fixed.
>>>
>>> For further details about the change see:
>>>
>>> https://github.com/oVirt/ovirt-ansible-engine-setup/commit/252c21d708e0e9a15a0793f4d0ef2c38870c76e6
>>>
>>> For failed test results see:
>>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11740/
>>> ___
>>> Infra mailing list -- infra@ovirt.org
>>> To unsubscribe send an email to infra-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/OGIBMJ4ZOBTQ7QNWJK637KCX3VEK5QRY/
>>>
>> _______
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/KCVMTRD6B7ADXA44IF6UH2QCLX5N47MZ/
>>
>
>>
>> --
>> Barak Korren
>> RHV DevOps team , RHCE, RHCi
>> Red Hat EMEA
>> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>>
>

-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/ZQCSCEIV6UEV4D5JNHV2CTVNJJMNJBDN/


Re: Space issues on lfedora1.lf-dev.marist.edu

2018-12-03 Thread Barak Korren
בתאריך יום ב׳, 3 בדצמ׳ 2018, 15:22, מאת Dan Horák :

> On Mon, 3 Dec 2018 14:24:00 +0200
> Barak Korren  wrote:
>
> > On Mon, 3 Dec 2018 at 10:37, Dan Horák  wrote:
> >
> > > Hi Barak,
> > >
> > > On Sun, 2 Dec 2018 09:50:34 +0200
> > > Barak Korren  wrote:
> > >
> > > > Hi Dan,
> > > >
> > > > How are you.
> > > >
> > > > As you know we've been using `lfedora1.lf-dev.marist.edu` to
> > > > generate s390x build of oVirt.
> > > >
> > > > We've recently seen some failures that have to do with running
> > > > out of space on the node. Some of this seems to be our fault, as
> > > > clearing up stale mock chroots we created freed up about 14G, but
> > > > after doing that I still see there are 48G used there (Is the OS
> > > > image that big?). Can some more space be cleared up on the node?
> > > > Could we perhaps have the disk space increased there?
> > >
> > > thanks for info, I'm looking into it. There are multiple users
> > > sharing the machine, so someone else might have used the all free
> > > space :-) How easily you could migrate your setup to our second
> > > guest (same specs)? We could try the containers there.
> > >
> >
> > I'd rather keep the current setup as it is, and have it keep working
> > as we try out the containers. We can remove it once the containers
> > are working well...
>
> ok, makes sense
>
> I've already removed some old cached data, so jobs on the guest should
> work again. I'm going to update and reboot the guest; sometimes there are
> removed, but not closed, files reducing the free disk space.
>
> > > I was also wondering, could we setup and use Docker on that node?
> > > > We've been switching to using containers on our regular CI nodes,
> > > > and it'd be a shame to leave s390x behind...
> > >
> > > I have been thinking about containers already as another level of
> > > interaction. I would prefer podman (and co) for the runtime, it's RH
> > > preferred technology, doesn't require a daemon and allows
> > > non-privileged use.
> > >
> >
> > I'm all for using podman down the line, but there are a few reasons
> > why we need docker currently:
> >
> >1. All our existing code had been developed and tested on Docker,
> > we will switch to podman eventually, but we're not gonna be ready for
> > that in the near future.
> >2. The main thing we want to do is use the jenkins-docker plugin
> > to spin up and remove the containers for us - there is AFAIK is no
> > plugin for podman ATM.
>
> might be worth letting the podman team know about it
>

I think I heard this idea being floated around, but there's no actual work going
on... I'd rather not hold my breath...


> > WRT non privileged use - we're currently still running mock inside the
> > container, so we need it to be privileged...
>
> AFAIK podman gives you a root user in the container even when you start
> the container as a regular user, which is why I like it for shared
> machines like this.
>

I understand perfectly... But given that it still has some maturing to do,
and I'd rather be able to use containers sooner rather than later, would you mind
trusting us not to break the machine for a while?
(We've been nice citizens so far, no?)

(Moving to containers across the board is the first step in deprecating a
lot of old code we have around mock, and I'd hate to have to keep it just
for s390x now that I know Docker is available...)


> Dan
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/PC54JDQR4JGJ4GO2WHSUCGXWNAPFLHYU/


Re: Change in vdsm[master]: spec: baseline qemu for 4.3

2018-12-05 Thread Barak Korren
בתאריך יום ד׳, 5 בדצמ׳ 2018, 18:35, מאת Ehud Yonasi :

> It fails because it runs on vm:
>
> hardware acceleration not available
> * hardware acceleration not available
>   # Start vms: ERROR (in 0:00:01)
>   # Destroy network vdsm_functional_tests_lago:
>   # Destroy network vdsm_functional_tests_lago: Success (in 0:00:00)
> @ Start specified VMs: ERROR (in 0:00:07)
> kvm executable not found
> + prepare_and_copy_yum_conf
>
> You will need to add to the stdci yaml file bare metal requirements for
> lago.
> runtime-requirements:
>   support-nesting-level: 2
>


No, that's not it.
It seems to have failed because the localrepo port was taken. Perhaps
someone manually aborted a previous job and did not clean up leftover lago
processes?
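
If so, something like this on the node should show what is still holding the
repo-server port (the port number and the process pattern below are
placeholders - check before killing anything):

    # see which process is listening on the suspected port (8585 is a placeholder)
    ss -ltnp | grep 8585
    # inspect leftover lago processes from the aborted run before cleaning up
    pgrep -af lago
    pkill -f 'lago ovirt'   # hypothetical pattern - verify the pgrep output first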


>
> On Wed, Dec 5, 2018 at 6:07 PM Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
>
>> the error seems totally unrelated to any test, it seems it’s failing
>> during lago deployment?. Can you please check that?
>>
>> *15:37:55* + lago ovirt deploy*15:37:56* @ Deploy oVirt environment: 
>> *15:37:56* @ Deploy oVirt environment: ERROR (in 0:00:00)*15:37:56* Error 
>> occured, aborting*15:37:56* Traceback (most recent call last):*15:37:56*   
>> File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 383, in 
>> do_run*15:37:56* self.cli_plugins[args.ovirtverb].do_run(args)*15:37:56* 
>>   File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in 
>> do_run*15:37:56* self._do_run(**vars(args))*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/lago/utils.py", line 505, in 
>> wrapper*15:37:56* return func(*args, **kwargs)*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/lago/utils.py", line 516, in 
>> wrapper*15:37:56* return func(*args, prefix=prefix, **kwargs)*15:37:56*  
>>  File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 181, in 
>> do_deploy*15:37:56* prefix.deploy()*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/lago/log_utils.py", line 636, in 
>> wrapper*15:37:56* return func(*args, **kwargs)*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/ovirtlago/reposetup.py", line 125, in 
>> wrapper*15:37:56* root_dir=prefix.paths.internal_repo(),*15:37:56*   
>> File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__*15:37:56*   
>>   return self.gen.next()*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 148, in 
>> repo_server_context*15:37:56* root_dir=root_dir,*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 127, in 
>> _create_http_server*15:37:56* 
>> generate_request_handler(root_dir),*15:37:56*   File 
>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 60, in 
>> __init__*15:37:56* ThreadingTCPServer.__init__(self, server_address, 
>> RequestHandlerClass)*15:37:56*   File 
>> "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__*15:37:56* 
>> self.server_bind()*15:37:56*   File "/usr/lib64/python2.7/SocketServer.py", 
>> line 430, in server_bind*15:37:56* 
>> self.socket.bind(self.server_address)*15:37:56*   File 
>> "/usr/lib64/python2.7/socket.py", line 224, in meth*15:37:56* return 
>> getattr(self._sock,name)(*args)*15:37:56* error: [Errno 98] Address already 
>> in use
>>
>>
>>
>>
>> Begin forwarded message:
>>
>> *From: *Code Review 
>> *Subject: **Change in vdsm[master]: spec: baseline qemu for 4.3*
>> *Date: *5 December 2018 at 16:05:40 CET
>> *To: *Michal Skrivanek 
>> *Reply-To: *jenk...@ovirt.org, michal.skriva...@redhat.com
>>
>> Jenkins CI *posted comments* on this change.
>>
>> View Change 
>>
>> Patch set 22:
>>
>> Build Failed
>>
>> http://jenkins.ovirt.org/job/vdsm_standard-on-merge/71/ : FAILURE
>>
>> http://jenkins.ovirt.org/job/vdsm_master_check-merged-el7-x86_64/3980/ :
>> FAILURE
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/4548/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-fc28-s390x/250/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-ppc64le/3966/
>> : SUCCESS
>>
>> http://jenkins.ovirt.org/job/standard-enqueue/17836/ :
>> This change was successfully submitted to the change queue(s) for system
>> testing.
>>
>>
>> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-fc28-x86_64/384/
>> : SUCCESS
>>
>>
>> To view, visit change 95518 . To
>> unsubscribe, visit settings .
>> Gerrit-Project: vdsm
>> Gerrit-Branch: master
>> Gerrit-MessageType: comment
>> Gerrit-Change-Id: I9ea583aa9deea5906a4e5a93c40c0f258fdbe02a
>> Gerrit-Change-Number: 95518
>> Gerrit-PatchSet: 22
>> Gerrit-Owner: Michal Skrivanek 
>> Gerrit-Reviewer: Francesco Romani 
>> Gerrit-Reviewer: Jenkins CI 
>> Gerrit-Reviewer: Michal Skrivanek 
>> Gerrit-Reviewer: Milan Zamazal 
>> Gerrit-Reviewer: Ryan Barry 
>> Gerrit-Reviewer: Sandro Bonazzola 
>> Gerrit-Reviewer: gerrit-ho

Re: Change in vdsm[master]: spec: baseline qemu for 4.3

2018-12-05 Thread Barak Korren
בתאריך יום ד׳, 5 בדצמ׳ 2018, 19:11, מאת Gal Ben Haim :

> There was a "lago ovirt deploy" command stuck on "
> vm0096.workers-phx.ovirt.org".
> My guess is that a previous run of vdsm check-merged got a timeout from
> Jenkins during "lago ovirt deploy",
> which caused the job to finish ungracefully (we know that Jenkins uses
> SIGKILL in order to stop jobs).
> My suggestion is to wrap vdsm's check-merged with the "timeout" command
> (like we did in OST).
>


Actually, one of the improvements we implemented because of CNV was to
enforce the timeout from inside rather than outside mock, so individual
projects do not need to do it on their own.

At this point it's almost certain this was either a manual abort or a
Jenkins reboot.


>
> On Wed, Dec 5, 2018 at 6:56 PM Barak Korren  wrote:
>
>>
>>
>> בתאריך יום ד׳, 5 בדצמ׳ 2018, 18:35, מאת Ehud Yonasi :
>>
>>> It fails because it runs on vm:
>>>
>>> hardware acceleration not available
>>> * hardware acceleration not available
>>>   # Start vms: ERROR (in 0:00:01)
>>>   # Destroy network vdsm_functional_tests_lago:
>>>   # Destroy network vdsm_functional_tests_lago: Success (in 0:00:00)
>>> @ Start specified VMs: ERROR (in 0:00:07)
>>> kvm executable not found
>>> + prepare_and_copy_yum_conf
>>>
>>> You will need to add to the stdci yaml file bare metal requirements for
>>> lago.
>>> runtime-requirements:
>>>   support-nesting-level: 2
>>>
>>
>>
>> No that's not it.
>> It seems to have failed because the localrepo port was taken. Perhaps
>> someone manually aborted a previous job and did not cleanup left over lago
>> processes?
>>
>>
>>>
>>> On Wed, Dec 5, 2018 at 6:07 PM Michal Skrivanek <
>>> michal.skriva...@redhat.com> wrote:
>>>
>>>> the error seems totally unrelated to any test, it seems it’s failing
>>>> during lago deployment?. Can you please check that?
>>>>
>>>> *15:37:55* + lago ovirt deploy*15:37:56* @ Deploy oVirt environment: 
>>>> *15:37:56* @ Deploy oVirt environment: ERROR (in 0:00:00)*15:37:56* Error 
>>>> occured, aborting*15:37:56* Traceback (most recent call last):*15:37:56*   
>>>> File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 383, in 
>>>> do_run*15:37:56* 
>>>> self.cli_plugins[args.ovirtverb].do_run(args)*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in 
>>>> do_run*15:37:56* self._do_run(**vars(args))*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/lago/utils.py", line 505, in 
>>>> wrapper*15:37:56* return func(*args, **kwargs)*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/lago/utils.py", line 516, in 
>>>> wrapper*15:37:56* return func(*args, prefix=prefix, 
>>>> **kwargs)*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 181, in 
>>>> do_deploy*15:37:56* prefix.deploy()*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/lago/log_utils.py", line 636, in 
>>>> wrapper*15:37:56* return func(*args, **kwargs)*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/ovirtlago/reposetup.py", line 125, in 
>>>> wrapper*15:37:56* root_dir=prefix.paths.internal_repo(),*15:37:56*   
>>>> File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__*15:37:56* 
>>>> return self.gen.next()*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 148, in 
>>>> repo_server_context*15:37:56* root_dir=root_dir,*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 127, in 
>>>> _create_http_server*15:37:56* 
>>>> generate_request_handler(root_dir),*15:37:56*   File 
>>>> "/usr/lib/python2.7/site-packages/ovirtlago/server.py", line 60, in 
>>>> __init__*15:37:56* ThreadingTCPServer.__init__(self, server_address, 
>>>> RequestHandlerClass)*15:37:56*   File 
>>>> "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__*15:37:56*
>>>>  self.server_bind()*15:37:56*   File 
>>>> "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind*15:37:56* 
>>>> self.socket.bind(self.server_address)*15:37:56*   File 
>>>> "

Re: [CQ]: 96120,3 (ovirt-engine) failed "ovirt-master" system tests

2018-12-12 Thread Barak Korren
Huh? Why is opstools coming from CentOS and not from our own mirrors?

Please open a RHV ticket to fix that. There is no reason for an OST run to
EVER get out of the PHX network.

On Wed, 12 Dec 2018 at 13:40, Dafna Ron  wrote:

> adding Evgheni.
> it seems that the mirrors for CentOS were not responsive
>
>
> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11964/consoleFull
>
>
> On Tue, Dec 11, 2018 at 8:58 PM oVirt Jenkins  wrote:
>
>> Change 96120,3 (ovirt-engine) is probably the reason behind recent system
>> test
>> failures in the "ovirt-master" change queue and needs to be fixed.
>>
>> This change had been removed from the testing queue. Artifacts build from
>> this
>> change will not be released until it is fixed.
>>
>> For further details about the change see:
>> https://gerrit.ovirt.org/#/c/96120/3
>>
>> For failed test results see:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11964/
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/TCD2FLFHSBKRO4IITNBPUT6NX7VUHR2C/
>>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/YIWEGSTOBC573LYJZI2INA7ZNZZLLECS/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/74F5IM4H3YII2Y5IMGKQVOGD4WSJOMQU/


Re: [CQ]: 96099,7 (vdsm) failed "ovirt-master" system tests

2018-12-12 Thread Barak Korren
On Wed, 12 Dec 2018 at 20:37, Dafna Ron  wrote:

> Barak,
> are we going out for gluster as well?
>

Looks like it. The mirrors doc on readthedocs shows you how to query Jenkins
for an updated list of mirrors. Anything you see in the reposync file that
is not an oVirt repo that we build and does not have a mirror is essentially a
bug we need to fix.


> python sudo yum yum-utils --setopt=tsflags=nocontexts*18:09:41* Failed to set 
> locale, defaulting to C*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.*18:09:41* 
> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>  [Errno 14] HTTP Error 503 - Service Unavailable*18:09:41* Trying other 
> mirror.
>
>
> On Wed, Dec 12, 2018 at 6:29 PM oVirt Jenkins  wrote:
>
>> Change 96099,7 (vdsm) is probably the reason behind recent system test
>> failures
>> in the "ovirt-master" change queue and needs to be fixed.
>>
>> This change had been removed from the testing queue. Artifacts build from
>> this
>> change will not be released until it is fixed.
>>
>> For further details about the change see:
>> https://gerrit.ovirt.org/#/c/96099/7
>>
>> For failed test results see:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11977/
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/S5UELL3FXP2SAKJGP3ZE3DJR7K6IGSWI/
>>
>

-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/NMOSUBC353EYLDZ5ZQHC5SMUEKZF57FU/


Re: [JIRA] (OVIRT-2624) gluster missing on mirror cause failure in vdsm build

2018-12-13 Thread Barak Korren
בתאריך יום ה׳, 13 בדצמ׳ 2018, 12:57, מאת Eyal Edri :

> Barak,
> Can we make the mirrors optional so it will fall back to the original
> repo?
>

That is the way it works - the system falls back to the upstream repo if
the mirrors server is not available.

Can a project like VDSM choose whether it uses mirrors or not from the settings?
>

Yes. All you need to do is set the repo id to something other than the
mirror name.
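
For illustration, assuming the "name,url" line format of the automation/*.repos
files, e.g. in check-patch.repos (the repo id below is hypothetical; the point
is that an id which does not match any CI mirror name keeps its upstream URL
as-is):

    my-gluster312-el7,http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/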


> Ehud, please verify that we have gluster 3.12 mirrored for now, but we
> really need to think about how to avoid failing on it if we don't have the mirror
> set up yet.
>

The reason we have mirrors is that we can occasionally fail if we don't.
The reason for the failed run is not the lack of a mirror, it's the unreliability
of the upstream repo. The mirror is there to protect you from that.



> On Thu, Dec 13, 2018 at 12:49 PM Dafna Ron (oVirt JIRA) <
> j...@ovirt-jira.atlassian.net> wrote:
>
>> Dafna Ron created OVIRT-2624:
>> 
>>
>>  Summary: gluster missing on mirror cause failure in vdsm
>> build
>>  Key: OVIRT-2624
>>  URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2624
>>  Project: oVirt - virtualization made easy
>>   Issue Type: Bug
>> Reporter: Dafna Ron
>> Assignee: infra
>> Priority: High
>>
>>
>> We failed vdsm build because gluster is missing from local mirrors
>> {noformat}
>> python sudo yum yum-utils --setopt=tsflags=nocontexts
>> 18:09:41 Failed to set locale, defaulting to C
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>> 18:09:41
>> http://mirror.centos.org/centos/7/storage/x86_64/gluster-3.12/repodata/repomd.xml:
>> [Errno 14] HTTP Error 503 - Service Unavailable
>> 18:09:41 Trying other mirror.
>>
>> {noformat}
>>
>>  http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/11977/
>>
>>
>>
>>
>> --
>> This message was sent by Atlassian Jira
>> (v1001.0.0-SNAPSHOT#100095)
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/XIRIVZO6QAV3SKKLMVCM7CP4DTSQEOL2/
>>
>
>
> --
>
> Eyal edri
>
>
> MANAGER
>
> RHV/CNV DevOps
>
> EMEA VIRTUALIZATION R&D
>
>
> Red Hat EMEA 
>  TRIED. TESTED. TRUSTED. 
> phone: +972-9-7692018
> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/4L5FB6PBQ42AH6KR6TZUNFWVLXEL3347/


Re: [JIRA] (OVIRT-2305) Clean up dead slaves from the staging Jenkins

2018-12-15 Thread Barak Korren
Yes.

בתאריך שבת, 15 בדצמ׳ 2018, 18:02, מאת Eyal Edri (oVirt JIRA) <
j...@ovirt-jira.atlassian.net>:

>
> [
> https://ovirt-jira.atlassian.net/browse/OVIRT-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=38593#comment-38593
> ]
>
> Eyal Edri commented on OVIRT-2305:
> --
>
> still needed?
>
> > Clean up dead slaves from the staging Jenkins
> > -
> >
> > Key: OVIRT-2305
> > URL: https://ovirt-jira.atlassian.net/browse/OVIRT-2305
> > Project: oVirt - virtualization made easy
> >  Issue Type: Outage
> >  Components: Staging infra
> >Reporter: Barak Korren
> >Assignee: infra
> >
>
>
>
>
> --
> This message was sent by Atlassian Jira
> (v1001.0.0-SNAPSHOT#100095)
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/2JHPVHRTWJZXKI4J4QJWUCNFQOPHA4SK/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/Y7FQ4YDN6BNLKVXWF62RCGZSFRLU5IZM/


Re: [JIRA] (OVIRT-1835) Create an automated update mechanism for slaves

2018-12-15 Thread Barak Korren
Even if slaves are ephemeral, you need an update mechanism for the base
image. Even more so than with non-ephemeral ones...

בתאריך שבת, 15 בדצמ׳ 2018, 17:25, מאת Eyal Edri (oVirt JIRA) <
j...@ovirt-jira.atlassian.net>:

>
> [
> https://ovirt-jira.atlassian.net/browse/OVIRT-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=38569#comment-38569
> ]
>
> Eyal Edri commented on OVIRT-1835:
> --
>
> I'm not sure we need to invest in this if the aim is to move to ephemeral
> slaves, either containers or VMs.
>
> > Create an automated update mechanism for slaves
> > ---
> >
> > Key: OVIRT-1835
> > URL: https://ovirt-jira.atlassian.net/browse/OVIRT-1835
> > Project: oVirt - virtualization made easy
> >  Issue Type: Improvement
> >  Components: Jenkins Slaves
> >Reporter: Barak Korren
> >Assignee: infra
> >  Labels: slaves
> >
> > While we already have an automated mechanism for controlling the YUM/DNF
> repo configuration on slaves, and we also already have a way to install or
> update components that are critical to the CI system (Both are provided by
> '{{global-setup.sh}}'), we're lacking a mechanism that can apply general
> system updates, and more importantly, kernel updates to slaves.
> > Some things we'd like such a mechanism to provide:
> > * Require a minimal amount of manual work (Best case scenario - nothing
> more than merging a repository update patch would be needed).
> > * Test updated slaves before putting them back to production use
> > * Allow rolling back failed updates
>
>
>
> --
> This message was sent by Atlassian Jira
> (v1001.0.0-SNAPSHOT#100095)
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/HQPCHBNAIGFCDURQNUMX234SS3FOPSCK/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/3HB2YYTUA47NHLPLAQCDOVFBXAPKTIUX/


Re: [CQ]: 96040,9 (ovirt-engine) failed "ovirt-master" system tests

2018-12-21 Thread Barak Korren
בתאריך יום ו׳, 21 בדצמ׳ 2018, 12:11, מאת Dafna Ron :

> There is a fix for this: https://gerrit.ovirt.org/#/c/96371/
> there are a few changes ahead of this patch in line so it would take a
> while to do so.
>


No, it won't. CQ checks all patches together. If there are no other issues,
it should finish after the current bisection cycle.



On Thu, Dec 20, 2018 at 10:52 PM oVirt Jenkins  wrote:
>
>> Change 96040,9 (ovirt-engine) is probably the reason behind recent system
>> test
>> failures in the "ovirt-master" change queue and needs to be fixed.
>>
>> This change had been removed from the testing queue. Artifacts build from
>> this
>> change will not be released until it is fixed.
>>
>> For further details about the change see:
>> https://gerrit.ovirt.org/#/c/96040/9
>>
>> For failed test results see:
>> http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/12052/
>> ___
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/IXTP6MW7FMHSEBKS2R3PLMTXET2R745K/
>>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/FIWYDLG3OQRRZTS4AFR4HZB6NK46SHEE/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/X3QMMIOYPTWU6CHK2XJJ3XM5WTT5EWQS/


Re: Tests failed because global_setup.sh failed

2018-12-25 Thread Barak Korren
On Tue, 25 Dec 2018 at 09:53, Yedidyah Bar David  wrote:

> On Mon, Dec 24, 2018 at 7:49 PM Nir Soffer  wrote:
> >
> > Not sure why global setup failed:
>
> Because of:
>
> + sudo -n systemctl enable postfix
> Failed to execute operation: Connection timed out
> + sudo -n systemctl start postfix
> Failed to start postfix.service: Connection timed out
> See system logs and 'systemctl status postfix.service' for details.
> + failed=true
>
>
Let's have the discussion on the Jira ticket:
https://ovirt-jira.atlassian.net/browse/OVIRT-2636



> Looked a bit and can't find system logs to try and understand why this
> failed.
>
> >
> > + [[ ! -O /home/jenkins/.ssh ]]
> > + [[ ! -G /home/jenkins/.ssh ]]
> > + verify_set_permissions 700 /home/jenkins/.ssh
> > + local target_permissions=700
> > + local path_to_set=/home/jenkins/.ssh
> > ++ stat -c %a /home/jenkins/.ssh
> > + local access=700
> > + [[ 700 != \7\0\0 ]]
> > + return 0
> > + [[ -f /home/jenkins/.ssh/known_hosts ]]
> > + verify_set_ownership /home/jenkins/.ssh/known_hosts
> > + local path_to_set=/home/jenkins/.ssh/known_hosts
> > ++ id -un
> > + local owner=jenkins
> > ++ id -gn
> > + local group=jenkins
> > + [[ ! -O /home/jenkins/.ssh/known_hosts ]]
> > + [[ ! -G /home/jenkins/.ssh/known_hosts ]]
> > + verify_set_permissions 644 /home/jenkins/.ssh/known_hosts
> > + local target_permissions=644
> > + local path_to_set=/home/jenkins/.ssh/known_hosts
> > ++ stat -c %a /home/jenkins/.ssh/known_hosts
> > + local access=644
> > + [[ 644 != \6\4\4 ]]
> > + return 0
> > + return 0
> > + true
> > + log ERROR Aborting.
> >
> > Build:
> >
> https://jenkins.ovirt.org/blue/rest/organizations/jenkins/pipelines/vdsm_standard-check-patch/runs/1048/nodes/125/steps/479/log/?start=0
>
> I found above in this log, but do not see this log in the artifacts:
>
> https://jenkins.ovirt.org/job/vdsm_standard-check-patch/1084/
>
> I (still?) do not know blue ocean well enough, so far I found it hard
> to understand and find stuff there.
>
> CI team: please try to make searches easier. Ideally, I'd like in above
> link (to a specific build of a specific job) to have a search box for that
> build, that searches in everything created by that build - perhaps not
> only artifacts, if above output from global_setup is not considered an
> artifact. Thanks.
>

You can create an RFE...


>
> Best regards,
> --
> Didi
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/ZWL7RLJY2L4QJEYD56JQ3HMMOWP7TSAL/


Re: credentials to ovirt Jenkins

2019-01-07 Thread Barak Korren
On Mon, 7 Jan 2019 at 15:06, Daniel Erez  wrote:

> Hi,
>
> Can you please add credentials to my user (derez) for re-triggering
> Jenkins jobs.
>

Please send this to infra-support to open a ticket.

But if the project is using STDCI V2 - you can also re-trigger by posting
'ci test please' on the patch/PR.


>
> Thanks,
> Daniel
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/VLEZS5X67U63JM4QCTNEE2NZXAAF76SQ/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/KCRHTPIQYP2STXRST5MCX6MSP5CWPIZY/


Offline slaves - staying offline for long periods of time

2019-01-16 Thread Barak Korren
Hi all,

I've spent some time over the last two days going over all our slaves that
were offline and bringing them back online.

Some slaves were offline because of obvious technical issues (Usually
having to do with disk space), and I've fixed those issues as well as wrote
some patches to automatically resolve them in the future.

Other slaves were put offline manually by people with comments that either
point to stalled Jira tickets or simply make some general suggestions to
clean up. In many cases those slaves were offline for a few months or in
some cases over a year.

Given the frequency in which we've seen our system used at full capacity
recently, I must urge people to avoid doing this. If you find a troublesome
slave please do one of the following:

   1. Resolve the technical issues and restore the slave to full working
   order
   2. Prove that the issue in question cannot be easily reproduced and
   restore the slave to full working order.
   3. Reinstall the slave from scratch
   4. (As a last resort) Open an urgent ticket to investigate and resolve
   the issue and *follow up* on it.

In any case please make an effort to avoid having a slave remain offline for
more than 2-3 days and having more than 1-2 slaves offline.

@Evgheni Dereveanchin  - do you think we can setup
some monitoring in Nagios so we get alerts if too many slaves are offline
or we have slaves offline for too long?
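
Something like the following could back such a check (just a sketch - it
assumes the standard Jenkins JSON API and that 'jq' is available on the
monitoring host; the threshold is made up):

    #!/bin/bash
    # list Jenkins nodes currently marked offline and alert if there are
    # more than a couple of them
    offline=$(curl -s 'https://jenkins.ovirt.org/computer/api/json' |
        jq -r '.computer[] | select(.offline) | .displayName')
    count=$(echo "$offline" | grep -c . || true)
    echo "offline slaves ($count): $offline"
    [ "$count" -le 2 ]   # non-zero exit -> alert

Tracking "offline for too long" would need keeping a bit of state between
runs, but the same API also exposes the offline cause/comment per node.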

-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/HRWC3RCV5LI7SHSUAXTAANQHJJDUMVDZ/


Re: Kubvirt jobs stuck for one day

2019-01-17 Thread Barak Korren
Your projects are not sharing resources with the kubevirt-ci jobs.

On Thu, 17 Jan 2019 at 17:55, Nir Soffer wrote:

> While checking why the CI is overloaded, I found these stuck jobs:
>
> openshift-kubevirt-org-kqjj4 (offline)
> 1  kubevirt_containerized-data-importer_standard-check-pr
>    #1096 containerized-data-importer [check-patch]
>    (check-patch.openshift-3.11.0-release.el7.x86_64)
>
> openshift-kubevirt-org-t1tbp (offline)
> 1  kubevirt_containerized-data-importer_standard-check-pr
>    #1096 containerized-data-importer [check-patch]
>    (check-patch.k8s-1.11.0-release.el7.x86_64)
>
> See https://jenkins.ovirt.org/
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/J4DMITGCLPGC7INEIU6E76VP3DH4UZEO/
>
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/Z6GMUR6HGVYMJFKIVJKJKK32EQGIEEDR/


Re: DNF is used on CentOS CI by default

2019-01-21 Thread Barak Korren
On Mon, 21 Jan 2019 at 12:53, Yedidyah Bar David  wrote:

> Hi all,
>
> I now noticed that at least in [1], otopi found and used DNF on
> CentOS. Meaning, we install at least enough of dnf there to make otopi
> be able to use it. Was this done on purpose? Or is this an unplanned
> result of a recent upgrade? Something else?
>
> [1] https://jenkins.ovirt.org/job/otopi_master_check-patch-el7-x86_64/506/
>
>
We did not add DNF intentionally; if it's there, and not in your `*.package`
file, it means it became a dependency of something in the `@buildsys-build`
package group that we install in mock.
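
If you want to track down what drags it in, a quick way to check (a sketch -
the mock config name here is just an example, use whatever chroot the job
actually uses) would be:

    # ask rpm, inside the chroot, which installed package requires dnf
    mock -r epel-7-x86_64 --shell 'rpm -q --whatrequires dnf'
    # and look at what the group itself contains
    mock -r epel-7-x86_64 --shell 'yum groupinfo buildsys-build'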



> --
> Didi
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/WTURJMU7NJJKOVYYBWAMGDLO627WEB4H/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/PQYGCK2OLLVGGIRXLO2RFRM4EDO3SVRT/


Re: master snapshot repo is empty (was: Re: failed engine check_patch)

2019-02-03 Thread Barak Korren
On Sun, 3 Feb 2019 at 15:06, Yedidyah Bar David  wrote:

> On Sun, Feb 3, 2019 at 2:27 PM Eitan Raviv  wrote:
> >
> > hi Didi,
> > I have several  failing patches on the following:
> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47750/console
> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47749/console
> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47746/console
> > any idea?
> > thanks
> >
> > 11:57:15 = test session starts
> ==
> > 11:57:15 platform linux2 -- Python 2.7.5 -- py-1.4.32 -- pytest-2.7.0
> > 11:57:15 rootdir:
> /home/jenkins/workspace/ovirt-engine_master_check-patch-el7-x86_64/ovirt-engine/packaging/setup,
> inifile:
> > 11:57:16 collected 0 items / 1 errors
> > 11:57:16
> > 11:57:16  ERRORS
> 
> > 11:57:16 ___ ERROR collecting
> tests/ovirt_engine_setup/engine_common/test_database.py ___
> > 11:57:16
> packaging/setup/tests/ovirt_engine_setup/engine_common/test_database.py:19:
> in 
> > 11:57:16 import ovirt_engine_setup.engine_common.database as
> under_test  # isort:skip # noqa: E402
> > 11:57:16
> packaging/setup/ovirt_engine_setup/engine_common/database.py:29: in 
> > 11:57:16 from otopi import base
> > 11:57:16 E   ImportError: No module named otopi
> > 11:57:16 === 1 error in 0.44 seconds
> 
> >
> >
>
> https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> is empty. Adding infra.
>

Rerunning publisher - this will take a while




> --
> Didi
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/VQZAUT5HSMSFVZVOXS2F6KX3XHIGEZPY/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/EPAICHRMT5CCHVLQWRQFUH27IDUGFAEE/


Re: master snapshot repo is empty (was: Re: failed engine check_patch)

2019-02-03 Thread Barak Korren
On Sun, 3 Feb 2019 at 15:30, Yedidyah Bar David  wrote:

> On Sun, Feb 3, 2019 at 3:16 PM Barak Korren  wrote:
> >
> >
> >
> > On Sun, 3 Feb 2019 at 15:06, Yedidyah Bar David  wrote:
> >>
> >> On Sun, Feb 3, 2019 at 2:27 PM Eitan Raviv  wrote:
> >> >
> >> > hi Didi,
> >> > I have several  failing patches on the following:
> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47750/console
> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47749/console
> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47746/console
> >> > any idea?
> >> > thanks
> >> >
> >> > 11:57:15 = test session starts
> ==
> >> > 11:57:15 platform linux2 -- Python 2.7.5 -- py-1.4.32 -- pytest-2.7.0
> >> > 11:57:15 rootdir:
> /home/jenkins/workspace/ovirt-engine_master_check-patch-el7-x86_64/ovirt-engine/packaging/setup,
> inifile:
> >> > 11:57:16 collected 0 items / 1 errors
> >> > 11:57:16
> >> > 11:57:16  ERRORS
> 
> >> > 11:57:16 ___ ERROR collecting
> tests/ovirt_engine_setup/engine_common/test_database.py ___
> >> > 11:57:16
> packaging/setup/tests/ovirt_engine_setup/engine_common/test_database.py:19:
> in 
> >> > 11:57:16 import ovirt_engine_setup.engine_common.database as
> under_test  # isort:skip # noqa: E402
> >> > 11:57:16
> packaging/setup/ovirt_engine_setup/engine_common/database.py:29: in 
> >> > 11:57:16 from otopi import base
> >> > 11:57:16 E   ImportError: No module named otopi
> >> > 11:57:16 === 1 error in 0.44 seconds
> 
> >> >
> >> >
> >>
> >> https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> >> is empty. Adding infra.
> >
> >
> > Rerunning publisher - this will take a while
>
> How long a while?
>
> I also see there an (unannounced, AFAICT) ovirt-4.3-snapshot. Perhaps
> master was accidentally dropped while creating this one? Perhaps it's
> better to copy from it and then run publisher?
>
>
completely unrelated. master should be back now



> >
> >
> >
> >>
> >> --
> >> Didi
> >> ___
> >> Infra mailing list -- infra@ovirt.org
> >> To unsubscribe send an email to infra-le...@ovirt.org
> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> >> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> >> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/VQZAUT5HSMSFVZVOXS2F6KX3XHIGEZPY/
> >
> >
> >
> > --
> > Barak Korren
> > RHV DevOps team , RHCE, RHCi
> > Red Hat EMEA
> > redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>
>
>
> --
> Didi
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/RARH6TNN4KJFLKCUGUICUQJ4SN3FDEYF/


Re: master snapshot repo is empty (was: Re: failed engine check_patch)

2019-02-03 Thread Barak Korren
On Sun, 3 Feb 2019 at 15:53, Yedidyah Bar David  wrote:

> On Sun, Feb 3, 2019 at 3:45 PM Barak Korren  wrote:
> >
> >
> >
> > On Sun, 3 Feb 2019 at 15:30, Yedidyah Bar David  wrote:
> >>
> >> On Sun, Feb 3, 2019 at 3:16 PM Barak Korren  wrote:
> >> >
> >> >
> >> >
> >> > On Sun, 3 Feb 2019 at 15:06, Yedidyah Bar David 
> wrote:
> >> >>
> >> >> On Sun, Feb 3, 2019 at 2:27 PM Eitan Raviv 
> wrote:
> >> >> >
> >> >> > hi Didi,
> >> >> > I have several  failing patches on the following:
> >> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47750/console
> >> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47749/console
> >> >> >
> https://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/47746/console
> >> >> > any idea?
> >> >> > thanks
> >> >> >
> >> >> > 11:57:15 = test session starts
> ==
> >> >> > 11:57:15 platform linux2 -- Python 2.7.5 -- py-1.4.32 --
> pytest-2.7.0
> >> >> > 11:57:15 rootdir:
> /home/jenkins/workspace/ovirt-engine_master_check-patch-el7-x86_64/ovirt-engine/packaging/setup,
> inifile:
> >> >> > 11:57:16 collected 0 items / 1 errors
> >> >> > 11:57:16
> >> >> > 11:57:16  ERRORS
> 
> >> >> > 11:57:16 ___ ERROR collecting
> tests/ovirt_engine_setup/engine_common/test_database.py ___
> >> >> > 11:57:16
> packaging/setup/tests/ovirt_engine_setup/engine_common/test_database.py:19:
> in 
> >> >> > 11:57:16 import ovirt_engine_setup.engine_common.database as
> under_test  # isort:skip # noqa: E402
> >> >> > 11:57:16
> packaging/setup/ovirt_engine_setup/engine_common/database.py:29: in 
> >> >> > 11:57:16 from otopi import base
> >> >> > 11:57:16 E   ImportError: No module named otopi
> >> >> > 11:57:16 === 1 error in 0.44 seconds
> 
> >> >> >
> >> >> >
> >> >>
> >> >>
> https://resources.ovirt.org/pub/ovirt-master-snapshot/rpm/el7/noarch/
> >> >> is empty. Adding infra.
> >> >
> >> >
> >> > Rerunning publisher - this will take a while
> >>
> >> How long a while?
> >>
> >> I also see there an (unannounced, AFAICT) ovirt-4.3-snapshot. Perhaps
> >> master was accidentally dropped while creating this one? Perhaps it's
> >> better to copy from it and then run publisher?
> >>
> >
> > completely unrelated.
>
> OK
>
> > master should be back now
>
> It's still empty. Perhaps some problem in jenkins?
>


I see it all there, check your caches...



> >
> >
> >>
> >> >
> >> >
> >> >
> >> >>
> >> >> --
> >> >> Didi
> >> >> ___
> >> >> Infra mailing list -- infra@ovirt.org
> >> >> To unsubscribe send an email to infra-le...@ovirt.org
> >> >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> >> >> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> >> >> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/VQZAUT5HSMSFVZVOXS2F6KX3XHIGEZPY/
> >> >
> >> >
> >> >
> >> > --
> >> > Barak Korren
> >> > RHV DevOps team , RHCE, RHCi
> >> > Red Hat EMEA
> >> > redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
> >>
> >>
> >>
> >> --
> >> Didi
> >
> >
> >
> > --
> > Barak Korren
> > RHV DevOps team , RHCE, RHCi
> > Red Hat EMEA
> > redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
>
>
>
> --
> Didi
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/XZYONY5ZBHLDLWXUBSE2KGSQY77QBQ2K/


Re: Rename GitHub repo ovirt-ansible-v2v-conversion-host

2019-02-05 Thread Barak Korren
On Tue, 5 Feb 2019 at 12:59, Sandro Bonazzola  wrote:

>
>
> Il giorno mar 5 feb 2019 alle ore 11:54 Tomáš Golembiovský <
> tgole...@redhat.com> ha scritto:
>
>> Hi,
>>
>> I would like to rename the GitHub repo ovirt-ansible-v2v-conversion-host
>> to v2v-conversion-host. Is there a list of things I should do before
>> or after the operation? Will the automation still work?
>>
>>
> Letting infra to answer about this.
>

Before you rename - you need to change the project name in the YAML in the
'jenkins' repo so that you'll get jobs that listen on the new name.
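
Roughly something like this (illustration only - the exact file and layout
depend on how the project is currently defined in the 'jenkins' repo):

    # before
    - project:
        name: ovirt-ansible-v2v-conversion-host
    # after
    - project:
        name: v2v-conversion-host

Once the renamed project config is merged and the jobs are regenerated, the
rename on the GitHub side can be done.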


> Adding Ondra and Martin, didn't we have a rule for repository naming
> related to ansible roles?
>
>
>
>> Tomas
>>
>> --
>> Tomáš Golembiovský 
>>
>
>
> --
>
> SANDRO BONAZZOLA
>
> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>
> Red Hat EMEA <https://www.redhat.com/>
>
> sbona...@redhat.com
> <https://red.ht/sig>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/JOLIWOBZNIZGR77TAOOMAODEZVXIMXPH/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/D7H2SBYVL2NAV2NN22MSJZ7SNHTHI526/


Re: INVALID_SERVICE: ovirtlago

2019-02-24 Thread Barak Korren
That is not the real issue; the real issue seems to be this:

+ sudo -n systemctl start docker
Job for docker.service failed because the control process exited with error
code. See "systemctl status docker.service" and "journalctl -xe" for
details.
+ sudo -n systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor
preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Mon
2019-02-25 04:03:52 UTC; 45ms ago
 Docs: https://docs.docker.com
  Process: 15496 ExecStart=/usr/bin/dockerd -H fd:// (code=exited,
status=1/FAILURE)
 Main PID: 15496 (code=exited, status=1/FAILURE)
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: Failed to
start Docker Application Container Engine.
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: Unit
docker.service entered failed state.
Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]:
docker.service failed.
+ :
+ log ERROR 'Failed to start docker service'
+ local level=ERROR


So docker is failing to start in the integ-test container. Here is the
podspec that was used:

---
apiVersion: v1
kind: Pod
metadata:
  generateName: jenkins-slave
  labels:
    integ-tests-container: ""
  namespace: jenkins-ovirt-org
spec:
  containers:
    - env:
        - name: JENKINS_AGENT_WORKDIR
          value: /home/jenkins
        - name: CI_RUNTIME_UNAME
          value: jenkins
        - name: STDCI_SLAVE_CONTAINER_NAME
          value: im_a_container
        - name: CONTAINER_SLOTS
          value: /var/lib/stdci
      image: docker.io/ovirtinfra/el7-runner-node:12c9f471a6e9eccd6d5052c6c4964fff3b6670c9
      command: ['/usr/sbin/init']
      livenessProbe:
        exec:
          command: ['systemctl', 'status', 'multi-user.target']
        initialDelaySeconds: 360
        periodSeconds: 7200
      name: jnlp
      resources:
        limits:
          memory: 32Gi
        requests:
          memory: 32Gi
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /var/lib/stdci
          name: slave-cache
        - mountPath: /dev/shm
          name: dshm
      workingDir: /home/jenkins
      tty: true
  nodeSelector:
    model: r620
  serviceAccount: jenkins-slave
  volumes:
    - hostPath:
        path: /var/lib/stdci
        type: DirectoryOrCreate
      name: slave-cache
    - emptyDir:
        medium: Memory
      name: dshm


Adding Gal and infra list.
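
To see why dockerd actually dies there, the next step (run inside the
integ-tests container) would be something along these lines - a sketch:

    # full unit log for the failed service
    journalctl -u docker --no-pager | tail -n 50
    # or run the daemon in the foreground with debug output
    /usr/bin/dockerd -D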


On Mon, 25 Feb 2019 at 08:45, Eitan Raviv  wrote:

> Hi,
> I have some OST patches failing on:
>
> *04:03:53* Error: INVALID_SERVICE: ovirtlago
>
> e.g. 
> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/3443/consoleFull
>
> I am fully rebased on ost master.
>
> Can you have a look?
>
> Thank you
>
>
> -- Forwarded message -
> From: Galit Rosenthal 
> Date: Mon, Feb 25, 2019 at 8:35 AM
> Subject: Re: INVALID_SERVICE: ovirtlago
> To: Eitan Raviv 
>
>
> I think you should consult Barak
>
> On Sun, Feb 24, 2019 at 8:26 PM Eitan Raviv  wrote:
>
>> 13:58:57 ++ sudo -n firewall-cmd --query-service=ovirtlago
>> 13:58:58 Error: INVALID_SERVICE: ovirtlago
>>
>> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/3430/consoleFull
>>
>>
>
> --
>
> GALIT ROSENTHAL
>
> SOFTWARE ENGINEER
>
> Red Hat
>
> <https://www.redhat.com/>
>
> ga...@gmail.comT: 972-9-7692230
> <https://red.ht/sig>
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/CYPPTQFSNMZXNAXIS44QMFSXO6UM3KHK/


Re: INVALID_SERVICE: ovirtlago

2019-02-26 Thread Barak Korren
On Tue, 26 Feb 2019 at 18:25, Gal Ben Haim  wrote:

> I've run 48 suites, and the issue didn't appear.
> I suggest merging [1], which will help us to understand the cause of
> the problem.
>
> [1] https://gerrit.ovirt.org/#/c/98048/
>


merged.


>
> On Tue, Feb 26, 2019 at 2:43 PM Eitan Raviv  wrote:
>
>> certainly more than a dozen.
>>
>> On Mon, Feb 25, 2019 at 11:17 AM Gal Ben Haim 
>> wrote:
>>
>>> Eitan,
>>>
>>> How many times did you see this error?
>>>
>>> On Mon, Feb 25, 2019 at 9:49 AM Barak Korren  wrote:
>>>
>>>> That is not the real issue, the real issue seems to be this:
>>>>
>>>> + sudo -n systemctl start docker
>>>> Job for docker.service failed because the control process exited with
>>>> error code. See "systemctl status docker.service" and "journalctl -xe" for
>>>> details.
>>>> + sudo -n systemctl status docker
>>>> ● docker.service - Docker Application Container Engine
>>>>Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled;
>>>> vendor preset: disabled)
>>>>Active: activating (auto-restart) (Result: exit-code) since Mon
>>>> 2019-02-25 04:03:52 UTC; 45ms ago
>>>>  Docs: https://docs.docker.com
>>>>   Process: 15496 ExecStart=/usr/bin/dockerd -H fd:// (code=exited,
>>>> status=1/FAILURE)
>>>>  Main PID: 15496 (code=exited, status=1/FAILURE)
>>>> Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]:
>>>> Failed to start Docker Application Container Engine.
>>>> Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]: Unit
>>>> docker.service entered failed state.
>>>> Feb 25 04:03:52 openshift-integ-tests-container-6bmr3 systemd[1]:
>>>> docker.service failed.
>>>> + :
>>>> + log ERROR 'Failed to start docker service'
>>>> + local level=ERROR
>>>>
>>>>
>>>> So docker is failing to start in the integ-test container. Here is the
>>>> podspec that was used:
>>>>
>>>> ---apiVersion: v1kind: Podmetadata:  generateName: jenkins-slave  labels:  
>>>>   integ-tests-container: ""  namespace: jenkins-ovirt-orgspec:  
>>>> containers:- env:- name: JENKINS_AGENT_WORKDIR  value: 
>>>> /home/jenkins- name: CI_RUNTIME_UNAME  value: jenkins  
>>>>   - name: STDCI_SLAVE_CONTAINER_NAME  value: im_a_container
>>>> - name: CONTAINER_SLOTS  value: /var/lib/stdci  image: 
>>>> docker.io/ovirtinfra/el7-runner-node:12c9f471a6e9eccd6d5052c6c4964fff3b6670c9
>>>>   command: ['/usr/sbin/init']  livenessProbe:exec: 
>>>>  command: ['systemctl', 'status', 'multi-user.target']
>>>> initialDelaySeconds: 360periodSeconds: 7200  name: jnlp  
>>>> resources:limits:  memory: 32Girequests:  
>>>> memory: 32Gi  securityContext:privileged: true  
>>>> volumeMounts:- mountPath: /var/lib/stdci  name: 
>>>> slave-cache- mountPath: /dev/shm  name: dshm  
>>>> workingDir: /home/jenkins  tty: true  nodeSelector:model: r620  
>>>> serviceAccount: jenkins-slave  volumes:- hostPath:path: 
>>>> /var/lib/stdcitype: DirectoryOrCreate  name: slave-cache- 
>>>> emptyDir:medium: Memory  name: dshm
>>>>
>>>>
>>>> Adding Gal and infra list.
>>>>
>>>>
>>>> On Mon, 25 Feb 2019 at 08:45, Eitan Raviv  wrote:
>>>>
>>>>> Hi,
>>>>> I have some OST patches failing on:
>>>>>
>>>>> *04:03:53* Error: INVALID_SERVICE: ovirtlago
>>>>>
>>>>> e.g. 
>>>>> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/3443/consoleFull
>>>>>
>>>>> I am fully rebased on ost master.
>>>>>
>>>>> Can you have a look?
>>>>>
>>>>> Thank you
>>>>>
>>>>>
>>>>> -- Forwarded message -
>>>>> From: Galit Rosenthal 
>>>>> Date: Mon, Feb 25, 2019 at 8:35 AM
>>>>> Subject: Re: INVALID_SERVICE: ovirtlago
>>>>> To: Eitan

Re: How to run manual job with standard STDCIV2 build results?

2019-02-27 Thread Barak Korren
%20system%20tests/job/ovirt-system-tests_manual/4159/
>>
>> [2]
>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4159/artifact/exported-artifacts/lago_logs/lago.log
>>
>> [3]
>> https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/1007/artifact/build-artifacts.el7.x86_64/
>> --
>> Didi
>> _______
>> Infra mailing list -- infra@ovirt.org
>> To unsubscribe send an email to infra-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/SGI4LEKEAHCLEYHCINGHALIXDCYJMM6B/
>>
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/infra@ovirt.org/message/6PUO7SQHHAKNSGHUBRP5JHH6ZUO2EB3N/
>


-- 
Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted
___
Infra mailing list -- infra@ovirt.org
To unsubscribe send an email to infra-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/infra@ovirt.org/message/2D5SINZWZCVPBT2WSTZBVR3ILC4LB6IL/


Re: How to run manual job with standard STDCIV2 build results?

2019-02-27 Thread Barak Korren
On Thu, 28 Feb 2019 at 08:40, Yedidyah Bar David  wrote:

> On Wed, Feb 27, 2019 at 2:22 PM Barak Korren  wrote:
> >
> >
> >
> > On Wed, 27 Feb 2019 at 13:00, Dafna Ron  wrote:
> >>
> >> Hi Didi.
> >>
> >> We blocked the ability to use external repos a while ago as they were
> causing a lot of failures in CQ.
> >>
> >> Please add the repos you need to the .repos v2 file and then re-run.
> >>
> >> Thanks,
> >> Dafna
> >>
> >>
> >> On Wed, Feb 27, 2019 at 10:43 AM Yedidyah Bar David 
> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> I run [1] and it fails with [2]:
> >>>
> >>> 2019-02-27
> 08:27:00,821::log_utils.py::__enter__::600::ovirtlago.reposetup::INFO::
> >>>  # Running repoman:  [0m [0m
> >>> 2019-02-27
> 08:27:00,822::log_utils.py::__enter__::600::lago.utils::DEBUG::start
> >>> task:87982007-be04-4d4b-af19-751f85f8df25:Run command: "repoman"
> >>> "--option=main.on_empty_source=warn"
> >>> "--option=store.RPMStore.on_wrong_distro=copy_to_all"
> >>> "--option=store.RPMStore.with_srcrpms=false"
> >>> "--option=store.RPMStore.with_sources=false"
> >>> "--option=store.RPMStore.rpm_dir="
> >>>
> "/dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default"
> >>> "add"
> "conf:/home/jenkins/workspace/ovirt-system-tests_manual/ovirt-system-tests/upgrade-from-release-suite-master/extra_sources"
> >>> "/var/lib/lago/ovirt-master-tested-el7:only-missing"
> >>> "/var/lib/lago/ovirt-master-snapshot-static-el7:only-missing"
> >>> "/var/lib/lago/glusterfs-5-el7:only-missing"
> >>> "/var/lib/lago/centos-updates-el7:only-missing"
> >>> "/var/lib/lago/centos-base-el7:only-missing"
> >>> "/var/lib/lago/centos-extras-el7:only-missing"
> >>> "/var/lib/lago/epel-el7:only-missing"
> >>> "/var/lib/lago/centos-ovirt-4.3-el7:only-missing"
> >>> "/var/lib/lago/centos-ovirt-common-el7:only-missing"
> >>> "/var/lib/lago/centos-qemu-ev-testing-el7:only-missing"
> >>> "/var/lib/lago/centos-opstools-testing-el7:only-missing"
> >>> "/var/lib/lago/centos-sclo-rh-release-el7:only-missing":
> >>> 2019-02-27
> 08:28:08,161::utils.py::_run_command::192::lago.utils::DEBUG::87982007-be04-4d4b-af19-751f85f8df25:
> >>> command exit with return code: 1
> >>> 2019-02-27
> 08:28:08,161::utils.py::_run_command::197::lago.utils::DEBUG::87982007-be04-4d4b-af19-751f85f8df25:
> >>> command stderr: 2019-02-27 08:27:01,014::INFO::repoman.cmd::
> >>> 2019-02-27 08:27:01,014::INFO::repoman.cmd::Adding artifacts to the
> >>> repo
> /dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default
> >>> 2019-02-27 08:27:01,015::INFO::repoman.common.stores.RPM::Loading repo
> >>>
> /dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default
> >>> 2019-02-27 08:27:01,015::INFO::repoman.common.stores.RPM::Repo
> >>>
> /dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default
> >>> loaded
> >>> 2019-02-27 08:27:01,015::INFO::repoman.common.stores.iso::Loading repo
> >>>
> /dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default
> >>> 2019-02-27 08:27:01,016::INFO::repoman.common.stores.iso::Repo
> >>>
> /dev/shm/ost/deployment-upgrade-from-release-suite-master/default/internal_repo/default
> >>> loaded
> >>> 2019-02-27 08:27:01,020::INFO::repoman.common.repo::Resolving artifact
> >>> source rec:
> https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/1007/artifact/build-artifacts.el7.x86_64/
> >>> 2019-02-27 08:27:01,022::INFO::repoman.common.sources.url::Recursively
> >>> fetching URL (level 0):
> >>>
> https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/1007/artifact/build-artifacts.el7.x86_64/
> >>> 2019-02-27 08:28:08,127::ERROR::repoman.cmd::maximum recursion depth
> >>> exceeded in cmp
> >>>
> >>> Perhaps this is a bug, or a result of supplying in CUSTOM_REPOS [3]:
> >>>
> >>> rec:
> https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/1007/artifac

Re: otopi: release branches and change queues

2019-03-11 Thread Barak Korren
On Mon, 11 Mar 2019 at 08:43, Yedidyah Bar David  wrote:

> Hi all,
>
> I branched yesterday otopi-1.8 to be used for ovirt-4.3, and forgot to
> patch stdci accordingly. Now pushed the patch [1] (to both master and
> otopi-1.8 - I realize I do not need it in master, but hopefully this
> will help me remember patching it in 4.4).


The release branches syntax was designed to allow you to do that.
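
i.e. something along these lines in the project's STDCI YAML (treat the exact
option name as an assumption and check the STDCI V2 docs - though the parser
is forgiving about case/dashes/underscores):

    release-branches:
      master: ovirt-master
      otopi-1.8: ovirt-4.3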


> What can/should I do to
> include the otopi-1.8.1 build [2] in the 4.3 change queue?
>

Either:

Merge this patch -> https://gerrit.ovirt.org/98408
And then merge a build patch to the 1.8 branch

Or:

Merge a patch making the master branch submit to the 4.3 CQ as well on top
of the build patch and then revert it.

Generally the CQ is a CI construct - it does not care about "build"
patches; from its POV all patches are equal and it just cares about the
latest patch in any given relevant branch.

We've suggested making an "official build" CQ in the past that would only
take in tagged patches and construct the "official" releases, but there did
not seem to be any eager adoption of this idea.

> And, related to this, some ideas about how to make stdci more useful
> for mere developers:
>
> 1. Make the change-queue post comments in gerrit patches once it has a
> result (patch passed change queue X, failed, something else?). I (as
> always) prefer a bit more information (e.g. a few more comments that
> might not be that interesting) than not enough (current). We can
> always refine later. IIRC this was already discussed in the past - any
> update?
>

Here is the old ticket about this:
https://ovirt-jira.atlassian.net/browse/OVIRT-1447

This idea is about as old as CQ itself...


> 2. Make the current comment "This change was successfully submitted to
> the change queue(s) for system testing." more informative - include
> names of change queues and links to relevant pages (builds? not sure,
> I do not know enough CI).
>

That message is sent from the V1 "standard-enqueu" job, and is irrelevant
for V2 projects. It is only sent BECAUSE there is a separate job that can
fail separately.

In V2 the CQ submission is done by the same "*-on-merge" job that also runs
"check-merged" and build the artifacts if needed. The CQ submission step
could be seen in the blue ocean screen for the job.

The CQ core code itself has quite an elaborate status notification hooking
mechanism that can be made to send a rather detailed information about the
processing stages the patch goes through in the CQ, but to make that send
notifications, someone would need to write some code. There is some
shortage of developers that are familiar with the CQ code ATM.


> 3. As a side note, I realize that the option names were made
> case-insensitive and ignore whitespace/dashes/underscores in order to
> not impose a certain style on the many different projects/maintainers
> and let them use what they find best for their project. I personally
> think this was a mistake. If you have to make a choice when naming
> something, and have a feeling that your opinion might cause
> noise/objections/etc. in the future, make a quick poll asking what
> people prefer, and then pick a single name. Having several (many)
> different names for things makes it much harder later on to _find_
> them. It might be too late now to change, because people might already
> picked different names in practice and we do not want to break their
> stuff, and that's a pity. But next time, be opinionated, or ask
> beforehand - not keep the choice. Most computer languages and
> libraries have single unique names for things, for a reason.
>

My view on the subject is different. Almost all languages have some styling
leeway and leave some decisions to project developers or code linting tool
authors. Many heavily used languages are case insensitive for example.
Notable examples include SQL and HTML. In the realm of configuration files
this seems to be quite the norm. In short, I think styling should be
enforced by an external tool and not the core parser or compiler.

I think having stuff "just work" instead of failing transparently because
you wrote "-" instead of "_" is more important then being grep friendly. In
this case writing the configuration is more common them grepping for it
IMO. And also making an insensitive version of grep, or otherwise, a tool
that would quickly convert a file to a canonical version should be easy
enough.
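
For example, a throwaway "style-insensitive" grep for an option name is a
one-liner (sketch - 'stdci.yaml' here stands in for whatever the config file
is called in a given project):

    # matches release-branches, release_branches, ReleaseBranches, ...
    opt=releasebranches
    grep -inE "^\s*$(echo "$opt" | sed 's/./&[-_]?/g'):" stdci.yaml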


> [1] https://gerrit.ovirt.org/#/q/I794426ec9212e0a023c3e5f158d0a88fc8e6842c
>
> [2] https://gerrit.ovirt.org/98408
> --
> Didi
> ___
> Infra mailing list -- infra@ovirt.org
> To unsubscribe send an email to infra-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/

Moving the wiki

2014-10-20 Thread Barak Korren
There has been some talk recently about moving the ovirt.org wiki to a 
dedicated VM in the PHX data-center.
I would like to open this up to some discussion.
Here is the current situation as far as I could gather; more info and comments 
are welcome.

What do we have now
---
MediaWiki (Which version? eedri told me it's a rather old one)
PHP (Which version?)
MySQL 5.1
All running in a single (?) 'large' OpenShift gear.
Our OpenShift account is classified as 'silver' (is it?) thereby granting us 
gears with 6GB storage (instead 
of 1GB)

Why do we want to migrate?
--
We occasionally have a problem where the site goes down. This seems to be 
caused by one of:
1. The OpenShift gear runs out of space
2. The MySQL DB gets stuck with no errors in the logs (Does restarting it 
resolve it?)

Why not to migrate?
---
1. Migrating the wiki to PHX VM will make the infra team have to manage the 
wiki hosting infrastructure. 
   While one may claim that this is not complicated and that this work needs to 
also be done when the wiki 
   is hosted on OpenShift, there are still many things that the OpenShift 
maintainers do for us such as:
   - Keeping the webservers updated
   - Managing selinux
   - Enabling automatic scale-up

2. There are security concerns with having a public-facing (outdated?) PHP 
application running on a VM in 
   the same network where our build and CI servers run. (I might be too 
paranoid here, having had one of my
   own sites defaced recently, but OpenShift makes it easy to create a new gear 
and git push the code to 
   get back up and running, with our own VM, forensics and cleanup might be 
more complicated) 

Known infra issues with existing configuration
--
1. The MySQL DB was set up without 'innodb_file_per_table' turned on, this can 
   impact DB performance. To resolve this, one needs to dump and import the 
   entire DB.

Things we can try (Besides migrating)
-
1. Place ovirt.org behind a caching reverse-proxy CDN like cloudflare, that can 
mask some of our downtime.
2. Dump and import the DB to rebuild and optimize the DB files (see the sketch
   after this list)
3. Rebuild the wiki in a new gear to get rid of possibly accumulating cruft
4. Upgrade the MySQL to 5.5 (Or whatever latest supported by OpenShift)
5. Upgrade MediaWiki
6. Add a redundant MySQL/Wiki server using MySQL replication
7. Trim the wiki history (AFAIK MediaWiki saves every edit ever, but one can 
maybe use export/import to get
   rid of some)  
8. Try to come up with a way to spread the Wiki across multiple OpenShift gears
9. Tune DB parameters (is it possible to do on OpenShift?)
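
For the dump/import idea (and the innodb_file_per_table issue above), the
mechanics are simple enough - a sketch, with the DB name being a guess and
the wiki taken down for the duration:

    mysqldump --single-transaction wikidb > wikidb-backup.sql
    # enable innodb_file_per_table in the MySQL config, restart MySQL, then:
    mysql -e 'DROP DATABASE wikidb; CREATE DATABASE wikidb'
    mysql wikidb < wikidb-backup.sql

The open question is mostly whether we can touch the server configuration at
all on an OpenShift gear.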

I eagerly await your comments,
Regards,
Barak.

___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Ansible work and organization

2016-08-29 Thread Barak Korren
> discussion too, and integration with your CI. The current setup is in my
> repo.

I tried to follow guidelines here:
http://docs.ansible.com/ansible/playbooks_best_practices.html

Other suggestions are welcome.

WRT Vagrant - I know this is a popular tool to use for Ansible role
testing, but I'd prefer that if and when we add a CI flow to our
Ansible repo, it would be based on our existing CI standards and Lago.

So far I didn't spend any effort in this direction though, because I had
the luxury of an unused "production" VM to play with...

>   - how some resources (Ansible roles…) are to be shared or not:
> the OSAS team is very small, shares its time and energy among
> several projects, you have several people with a good Puppet background
> but less so on Ansible, so I would advise sharing workload. The OSAS
> team can act as a neutral third party and insure all needed features are
> taken into account, reviews are done in a timely manner, for example. It
> was just a logical move from us to allow us spending less time on these
> tasks¸ but this is open to discussion (as stated on my previous messages)

I think I already answered this above: if OSAS provides usable
stand-alone roles, we will be happy to consume them and contribute as
needed. We've used this kind of process with Puppet modules in the
past, and I'm also trying to follow it with the 3rd-party modules I
already use.

> As we say here: yoroshiku onegai shimasu.

(Trying to practice my very broken Japanese :)

Shinpaishinaide kudasai.
Watashi o yurushitekudasai


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: ovirt-provider-ovn - new gerrit.ovirt.org project

2016-08-29 Thread Barak Korren
> On Mon, Aug 29, 2016 at 1:11 PM, Marcin Mirecki  wrote:
>>
>> All,
>>
>> Can you please add a new project to gerrit.ovirt.org?
>> The project name: ovirt-provider-ovn
>>
>> The project will contain the ovn external network provider.
>> I will maintain it.
>>

Project created.
https://gerrit.ovirt.org/#/admin/projects/ovirt-provider-ovn



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm CI on 3.6 missing vmconsole package

2016-08-31 Thread Barak Korren
Where should this package come from?

I'm not seeing it in any of the repos that are configured for the mock
environment, unless it's supposed to come from the base fedora or
fedora updates repos.

If it's a caching issue we should be seeing error messages indicating
failure to download repo metadata. I did not find any...

On 30 August 2016 at 21:31, Nir Soffer  wrote:
> Hi all,
>
> I got this error building patch for 3.6:
>
> 18:24:28 Last metadata expiration check: 0:00:10 ago on Tue Aug 30
> 18:24:17 2016.
> 18:24:33 Error: nothing provides ovirt-vmconsole >= 1.0.0-0 needed by
> vdsm-4.17.34-7.git014860f.fc23.noarch.
>
> Looks like bad repositories or caching issues.
>
> See http://jenkins.ovirt.org/job/vdsm_3.6_check-patch-fc23-x86_64/48/console
>
> Nir
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm CI on 3.6 missing vmconsole package

2016-08-31 Thread Barak Korren
You seem to be using the master snapshot repos and are on fc24. This
is a 3.6 build on fc23...

The automation/check-patch.repos.fc23 file in the ovirt-3.6 branch
currently only includes the 3.6 snapshot repos (As it probably
should).

So you probably need to make sure 'ovirt-vmconsole' finds its way to those repos.

On 31 August 2016 at 10:27, Francesco Romani  wrote:
> - Original Message -----
>> From: "Barak Korren" 
>> To: "Nir Soffer" 
>> Cc: "infra" , "Francesco Romani" 
>> Sent: Wednesday, August 31, 2016 9:23:14 AM
>> Subject: Re: Vdsm CI on 3.6 missing vmconsole package
>>
>> Where should this package come from?
>
> From our own ovirt repos. On my laptop, for example:
>
> Available Packages
> Name: ovirt-vmconsole
> Arch: src
> Epoch   : 0
> Version : 1.0.4
> Release : 0.0.master.20160805075548.git6c59386.fc24
> Size: 254 k
> Repo: ovirt-master-snapshot
> Summary : oVirt VM console
> URL : http://www.ovirt.org
> License : GPLv3
> Description : oVirt VM console proxy
>
> I'm on fedora 24.
>
>
> --
> Francesco Romani
> RedHat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm CI on 3.6 missing vmconsole package

2016-08-31 Thread Barak Korren
On 31 August 2016 at 10:44, Francesco Romani  wrote:
> - Original Message -
>> From: "Barak Korren" 
>> To: "Francesco Romani" 
>> Cc: "Nir Soffer" , "infra" 
>> Sent: Wednesday, August 31, 2016 9:34:51 AM
>> Subject: Re: Vdsm CI on 3.6 missing vmconsole package
>>
>> You see to be using the master snapshot repos  and are on fc24. This
>> is a 3.6 build on fc23...
>>
>> The automation/check-patch.repos.fc23 file in the ovirt-3.6 branch
>> currently only includes the 3.6 snapshot repos (As it probably
>> should).
>
> Right: in the 3.6 repo we have ovirt-vmconsole packages for fc22, while on the 
> 4.0 repo
> we have the packages for fc24.
>
> How can I upload them?
>

The nightly publisher job should be doing that for you, but it seems
to only publish 3.6 for el6 and el7 ATM. That is probably a bug.

In more detail, it looks like somebody made the
'copy-create-job-artifact-all-platforms' YAML macro just be an alias
to 'copy-create-job-artifact-engine-platforms' instead of actually
including all platforms...

@sbonazzo, any idea why is it setup like this ATM?

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Jenkins permissions

2016-09-01 Thread Barak Korren
What do you need?


On 1 September 2016 at 15:20, Martin Sivak  wrote:
> Hi,
>
> can I please get the necessary permissions to be able to work as a
> developer and mom and hosted engine ha maintainer?
>
> Thanks
>
> Martin Sivak
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Ansible best practices

2016-09-04 Thread Barak Korren
On 4 September 2016 at 13:04, Eyal Edri  wrote:
> Since we started writing some Ansible in the team lately, might be relevant:
>
> https://www.ansible.com/blog/ansible-best-practices-essentials
>
See my discussion with Duck, I tried to follow that where possible.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [JIRA] (OVIRT-696) create an 'infra-ansible' repo

2016-09-06 Thread Barak Korren
On 6 September 2016 at 10:21, Marc Dequènes (Duck)  wrote:
>
> So cleanup done. In fact the commit hook added them when doing the git
> am :-/.
>
> So, when working on a topic branch, people usually intend to work on the
> same feature, so to stakup improvements and fixes on the same
> "patchset". I fail to see how to do that with git review and the current
> commit hook, as it does not seem to care about handling a branch as a
> series of patches in the same patchset. Are we supposed to handle the
> Change-ID manually?
>
> Please help on this matter, the conflicting docs on Internet did not
> really help.
>

In Gerrit every commit becomes a patch; there is no way to put
multiple commits in the same patch (this is probably the most notable
difference between Gerrit and pull-request based systems).
The idea in Gerrit is that once you are done developing, you squash
your commits together to generate patches that will be easy for
reviewers to understand.


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [JIRA] (OVIRT-696) create an 'infra-ansible' repo

2016-09-06 Thread Barak Korren
> I'm not really sure I understand what would happen if I push to review
> several commits with the same Change-Id; would it create a patchset or
> replace the former commits?

Patch sets within the same change ID replace each other, with the most
recent one being the one that is reviewed and potentially merged.

> Also, even if I would probably not send as big changes as now again, I
> see kernel devs sending features as a series of patches for more
> readability, so Gerrit is unable to do that?

It is. You just give each commit a different change id and push it as
a different patch. Gerrit knows how to track dependencies between
patches and present them in the UI.

git-review helps here - Given branch 'foo' with N commits on top of
'master', checking out 'foo' and running 'git review' would create a
gerrit patch and a change-id for each commit and would send them to
Gerrit while also setting the topic in gerrit to 'foo' to allow
looking at the patches as a group.

Please note that each commit is reviewed separately, a change
requested by a reviewer for an earlier commit may necessitate
adaptations for later patches. When submitting sets of patches one
must be ready to perform the occasional 'rebase -i'...
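
A minimal sketch of that flow (assuming git-review is already set up for the
repo):

    git checkout -b foo origin/master
    # ...hack away, committing as often as you like...
    git rebase -i origin/master   # squash/reorder into a few reviewable commits
    git review                    # pushes each remaining commit as its own
                                  # Gerrit change, with the topic set to 'foo'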

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [JIRA] (OVIRT-696) create an 'infra-ansible' repo

2016-09-06 Thread Barak Korren
On 6 September 2016 at 15:21, Marc Dequènes (Duck)  wrote:
>
> On 09/06/2016 09:15 PM, Marc Dequènes (Duck) wrote:
>
>> That is not very practical in this case. I'd like to preserve the work
>> history, as said in the meeting.
>
> I mean in most cases squashing is perfectly fine, unless it is very big,
> so I agree with this method. I'm just looking for a way to get the
> previous work history, so this is a one-shot situation.
>
> As the previous way of pushing changes was different, there is no other
> way to also push the logic motivating the changes. After, reviews would
> take over.

I find that typically work history can be less than ideal for
presenting work to other developers. For me I typically find that I
can narrow down commits by a factor of 3-5 when going from actual work
history to a commit series that describes gradual accumulation of
major features.

Please note that I did not mean that you should squash all commits to a
single big one, just narrow them down and reorder to make it easier
for us to understand your major themes and ideas.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [VDSM] build-artifacts failing on master

2016-09-15 Thread Barak Korren
On 14 September 2016 at 22:31, Eyal Edri  wrote:
> Its actually a good question to know if standard CI supports versions of
> RPMs.
> Barak - do you know if we can specify in build-artifacts.packages file a
> version requirement?
>
> for e.g python-nose >= 1.3.7
>

You can only use the syntax supported by yum.
Since yum always installs latest by default, a '>=' syntax would not
be useful with it.
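
i.e. a build-artifacts.packages (or check-patch.packages) file is just a
plain list of yum package specs, one per line, e.g. (sketch):

    python-nose
    python-devel

If an exact version really has to be pinned it would need to be spelled in a
form yum understands (e.g. a full 'python-nose-1.3.7'), and that exact build
has to actually be available in the configured repos.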


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] [VDSM] build-artifacts failing on master

2016-09-15 Thread Barak Korren
>
> I love running tests on the build systems - its gives another layer of
> assurance that we are going to build a good package for the relevant
> system/architecture.
>
> However, the offending patch makes it impossible on el7-based build
> system. Can we instead skip the test (on such systems) if the right nose
> version is not installed?
>
> We should file a bug to fix nose on el7.
>

IMO test requirements != build req != runtime req.

It is perfectly valid to use virtualenv and pip to enable using the
latest and greatest testing tools, but those should __only__ be used
in a testing environment. Those should not be used in a build
environment which is designed to be reproducible and hence is
typically devoid of network access.
Deploy requirements should be tested, but by their nature those tests
need to run post-build and hence are better left to integration tests
like ovirt-system-tests.
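
E.g. a check-patch.sh could do something like this (sketch - assumes
python-virtualenv is listed in the check-patch packages file and that the
tests live under tests/):

    virtualenv .venv
    .venv/bin/pip install "nose>=1.3.7"
    .venv/bin/nosetests tests/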

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] [VDSM] build-artifacts failing on master

2016-09-15 Thread Barak Korren
>
> I'm not sure I understand your point. RPM spec files have %check section
> for pre-build tests. Should we, or should we not, strive to use them?

AFAIK the %check section is only relevant for DS builds, as in US we
have many other places to run tests from (e.g check_patch.sh).

AFAIK by its nature %check only allows for source-level tests (since
you don't have an RPM yet). Source-level tests can and should run
before a patch is merged. The DS build, however, happens long after
the patch is merged, and by the time it does, all source-level issues
should have been checked for.
Therefore, I think making tests run in %check should not be a high priority.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: fabric jobs in ci

2016-09-26 Thread Barak Korren
This should fix it: https://gerrit.ovirt.org/#/c/64474/1

On 26 September 2016 at 10:20, Eyal Edri  wrote:
> Fabric has 2 jobs that are failing [1] for over a month, if no one has
> objection i'll disable them for now since they are just adding noise to the
> unstable jobs which should be 0 on normal state.
>
> I know we have some work in progress for fabric so it might not be ready,
> so I suggest to add these jobs to a jira ticket that is tracking progress on
> this task
> and the owner will enable them once the fabric project will be stable.
>
>
> [1]
> http://jenkins.ovirt.org/job/fabric-ovirt_master_check-merged-el7-x86_64/
>
> --
> Eyal Edri
> Associate Manager
> RHV DevOps
> EMEA ENG Virtualization R&D
> Red Hat Israel
>
> phone: +972-9-7692018
> irc: eedri (on #tlv #rhev-dev #rhev-integ)
>
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
>



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [VDSM] All tests using directio fail on CI

2016-09-28 Thread Barak Korren
The CI setup did not change recently.

All standard-CI jobs run inside mock (chroot) which is stored on top
of a regular FS, so they should not be affected by the slave OS at all
as far as FS settings go.

But perhaps some slave-OS/mock-OS combination is acting strangely, so
could you be more specific and point to particular job runs that fail?
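
In the meantime, a quick way to check a suspect slave (a rough sketch - dd's
oflag=direct fails with 'Invalid argument' on filesystems such as tmpfs that
don't support O_DIRECT):

df -hT /var/tmp       # what is /var/tmp actually backed by?
dd if=/dev/zero of=/var/tmp/directio-probe bs=4096 count=1 oflag=direct
rm -f /var/tmp/directio-probe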

On 28 September 2016 at 22:00, Nir Soffer  wrote:
> Hi all,
>
> It seems that the CI setup has changed, and /var/tmp is using now tempfs.
>
> This is not compatible with vdsm tests, assuming that /var/tmp is a real file
> system. This is the reason we do not use /tmp.
>
> We have lot of storage tests using directio, and directio cannot work on
> tempfs.
>
> Please check the slaves and make sure /var/tmp is using file system supporting
> directio.
>
> See example failure bellow.
>
> Nir
>
> 
>
> 12:33:20 
> ==
> 12:33:20 ERROR: test_create_fail_creating_lease
> (storage_volume_artifacts_test.BlockVolumeArtifactsTests)
> 12:33:20 
> --
> 12:33:20 Traceback (most recent call last):
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/storage_volume_artifacts_test.py",
> line 485, in test_create_fail_creating_lease
> 12:33:20 *BASE_PARAMS[sc.RAW_FORMAT])
> 12:33:20   File "/usr/lib64/python2.7/unittest/case.py", line 513, in
> assertRaises
> 12:33:20 callableObj(*args, **kwargs)
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/vdsm/storage/sdm/volume_artifacts.py",
> line 391, in create
> 12:33:20 desc, parent)
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/vdsm/storage/sdm/volume_artifacts.py",
> line 482, in _create_metadata
> 12:33:20 sc.LEGAL_VOL)
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/vdsm/storage/volume.py",
> line 427, in newMetadata
> 12:33:20 cls.createMetadata(metaId, meta.legacy_info())
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/vdsm/storage/volume.py",
> line 420, in createMetadata
> 12:33:20 cls._putMetadata(metaId, meta)
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/vdsm/storage/blockVolume.py",
> line 242, in _putMetadata
> 12:33:20 f.write(data)
> 12:33:20   File
> "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/lib/vdsm/storage/directio.py",
> line 161, in write
> 12:33:20 raise OSError(err, msg)
> 12:33:20 OSError: [Errno 22] Invalid argument
> 12:33:20  >> begin captured logging << 
> 
> 12:33:20 2016-09-28 05:32:03,254 DEBUG   [storage.PersistentDict]
> (MainThread) Created a persistent dict with VGTagMetadataRW backend
> 12:33:20 2016-09-28 05:32:03,255 DEBUG   [storage.PersistentDict]
> (MainThread) read lines (VGTagMetadataRW)=[]
> 12:33:20 2016-09-28 05:32:03,255 DEBUG   [storage.PersistentDict]
> (MainThread) Empty metadata
> 12:33:20 2016-09-28 05:32:03,255 DEBUG   [storage.PersistentDict]
> (MainThread) Starting transaction
> 12:33:20 2016-09-28 05:32:03,256 DEBUG   [storage.PersistentDict]
> (MainThread) Flushing changes
> 12:33:20 2016-09-28 05:32:03,256 DEBUG   [storage.PersistentDict]
> (MainThread) about to write lines (VGTagMetadataRW)=['CLASS=Data',
> 'POOL_UUID=52fe3782-ed7a-4d84-be3d-236faebdca2d',
> 'SDUUID=c088020e-91b2-45e6-85fd-eea3fce58764', 'VERSION=3',
> '_SHA_CKSUM=3f26f105271d3f7af6ea7458fb179418e7f9c139']
> 12:33:20 2016-09-28 05:32:03,257 DEBUG
> [storage.Metadata.VGTagMetadataRW] (MainThread) Updating metadata
> adding=MDT_POOL_UUID=52fe3782-ed7a-4d84-be3d-236faebdca2d,
> MDT__SHA_CKSUM=3f26f105271d3f7af6ea7458fb179418e7f9c139,
> MDT_SDUUID=c088020e-91b2-45e6-85fd-eea3fce58764, MDT_CLASS=Data,
> MDT_VERSION=3 removing=
> 12:33:20 2016-09-28 05:32:03,257 DEBUG   [storage.PersistentDict]
> (MainThread) Finished transaction
> 12:33:20 2016-09-28 05:32:03,261 INFO
> [storage.BlockVolumeArtifacts] (MainThread) Create placeholder
> /var/tmp/tmp3UKYqC/52fe3782-ed7a-4d84-be3d-236faebdca2d/c088020e-91b2-45e6-85fd-eea3fce58764/images/a1e3aa74-2304-4e0c-b8e8-f5b86f8d3ac7
> for image's volumes
> 12:33:20 2016-09-28 05:32:03,264 WARNING
> [storage.StorageDomainManifest] (MainThread) Could not find mapping
> for lv 
> c088020e-91b2-45e6-85fd-eea3fce58764/3267d54d-c0fa-4fe1-b82b-b88dd5f90de3
> 12:33:20 2016-09-28 05:32:03,264 DEBUG
> [storage.StorageDomainMan

Re: [ovirt-devel] [VDSM] All tests using directio fail on CI

2016-09-29 Thread Barak Korren
On 29 September 2016 at 11:08, Yaniv Kaul  wrote:
> zram does not support direct IO (tested, indeed fails).
> What I do is host the VMs there, though - this is working - but I'm using
> Lago (and not oVirt). does oVirt need direct IO for the temp disks? I
> thought we are doing them on the libvirt level?

It's not oVirt that needs those, it's some specific VDSM unit tests.


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: building engine artificats from a posted patch?

2016-11-01 Thread Barak Korren
> Also over time I lost privilege to re-trigger failed jenkins jobs, if this is 
> something different, can you re-grant me privileges to do that?
>

Hi Martin,

I granted you the 'edv' role on Jenkins. It should provide you with
what you need.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: vdsm_master_check-patch fails consistently before starting the build

2016-11-03 Thread Barak Korren
We had some issues with resources.ovirt.org last night.

Most recent run [1] seems to fail one of the tests:

==
ERROR: test_get_info_generation_id(100, 100)
(storage_volume_test.VolumeManifestTest)
--
Traceback (most recent call last):
  File 
"/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/tests/testlib.py",
line 133, in wrapper
return f(self, *args)
  File 
"/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/tests/storage_volume_test.py",
line 152, in test_get_info_generation_id
self.assertEqual(info_gen, vol.getInfo()['generation'])
  File 
"/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/vdsm/storage/volume.py",
line 239, in getInfo
info['qcow2compat'] = qemu_info["compat"]
KeyError: 'compat'


[1]: 
http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/3621/console

On 3 November 2016 at 00:39, Nir Soffer  wrote:
> Hi all,
>
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/
>
> Seems to fail here:
>
> 22:17:59 Start: yum install
> 22:28:10 ERROR: Command failed. See logs for output.
>
> tests are fine on travis (except some know failures).
> https://travis-ci.org/oVirt/vdsm/builds/172772499
>
> tests are fine locally on fedora 24.
>
> Please check.
>
> Thanks,
> Nir
> _______
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: ovirt-provider-ovn - new gerrit.ovirt.org project

2016-11-17 Thread Barak Korren
> Thank you, Barrak.
>
> I don't know if it has ever worked, but currently I get
>
> $ git clone git://gerrit.ovirt.org/ovirt-provider-ovn.git
> Cloning into 'ovirt-provider-ovn'...
> fatal: Could not read from remote repository.
>
> Please make sure you have the correct access rights
> and the repository exists.
>
> could you (or someone) look into it?

Please open a new Jira ticket (just mail with a new subject to
infra-supp...@ovirt.org)

In the meantime you can use the git+ssh url.
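For example (substitute your own Gerrit user name):

git clone ssh://<user>@gerrit.ovirt.org:29418/ovirt-provider-ovn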


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Change in jenkins[master]: master_upgrade_from_master: Upgrade to self instead of snapshot

2016-11-24 Thread Barak Korren
On 24 November 2016 at 14:34, Yedidyah Bar David  wrote:
> On Thu, Nov 24, 2016 at 1:13 PM, Code Review  wrote:
>> From Jenkins CI:
>>
>> Jenkins CI has posted comments on this change.
>>
>> Change subject: master_upgrade_from_master: Upgrade to self instead of 
>> snapshot
>> ..
>>
>>
>> Patch Set 1: Continuous-Integration-1
>>
>> Build Failed
>>
>> http://jenkins.ovirt.org/job/jenkins_master_check-patch-fc25-x86_64/5/ : 
>> FAILURE
>
> 11:08:53 Unable to find mock env fc25.*x86_64 use one of
> el6:epel-6-x86_64 el7:epel-7-x86_64 el7:epel-7-ppc64le
> fc23:fedora-23-x86_64 fc24:fedora-24-x86_64 fc24:fedora-24-ppc64le
> 11:08:53 Build step 'Execute shell' marked build as failure
>
> Any idea?
>
Fc25 mock is still pending:
https://gerrit.ovirt.org/#/c/66809/

Not sure why engine is trying to use it ATM.


-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] Gerrit headers are not added to commits in vdsm repo

2016-11-27 Thread Barak Korren
On 25 November 2016 at 16:57, Nir Soffer  wrote:
> On Fri, Nov 25, 2016 at 4:45 PM, Tomáš Golembiovský  
> wrote:
>> Hi,
>>
>> I've noticed that in vdsm repo the merged commits do not contain the
>> info headers added by Gerrit any more (Reviewed-by/Reviewed-on/etc.).
>>
>> Is that intentional? If yes, what was the motivation behind this?
>>
>> The change seem to have happened about 4 days ago. Sometime between the
>> following two commits:
>>
>> * 505f5da  API: Introduce getQemuImageInfo API. [Maor Lipchuk]
>> * 1c4a39c  protocoldetector: Avoid unneeded getpeername() [Nir Soffer]
>
> We switched vdsm to fast-forward 4 days ago, maybe this was unintended
> side effect of this change?
>
> The gerrit headers are very useful, please add back.
>

Headers cannot be added in fast-forward mode b/c then you end up with
a new commit hash - not fast forwarding to an existing commit.

In other words - headers are only added when Gerrit creates a new
commit - which in FF mode it never does.
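
This is easy to see locally (a quick sketch you can run in any throw-away clone):

git rev-parse HEAD
git commit --amend -m "same tree, different message"
git rev-parse HEAD   # a different hash, even though the tree is unchanged
# so a commit with Reviewed-on/Reviewed-by lines appended can never be a
# fast-forward of the commit that was actually reviewed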

I could move the project to "Rebase Always" which is like "Rebase if
Necessary" but always creates a new commit (with headers). Please note
that this is less strict than ff-only and therefore would lead to
merged combinations that do not get tested in CI.

-- 
Barak Korren
bkor...@redhat.com
RHEV-CI Team
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [oVirt Jenkins] test-repo_ovirt_experimental_master - Build #3801 - SUCCESS!

2016-11-30 Thread Barak Korren
On 1 December 2016 at 09:26, Eyal Edri  wrote:
> Will this error get solved also by the patch for replacing the proxies?
> Or we need to mirror epel to oVirt to avoid such errors?
>
> 05:28:44 and following error: Error setting up repositories: failure:
> repodata/486c936a72b1d31db8b5892cb0c0372ba3c171509f168c1c24b5e32d5bf11861-primary.sqlite.xz
> from ovirt-master-epel-el7: [Errno 256] No more mirrors to try.
> 05:28:44
> http://download.fedoraproject.org/pub/epel/7/x86_64/repodata/486c936a72b1d31db8b5892cb0c0372ba3c171509f168c1c24b5e32d5bf11861-primary.sqlite.xz:
> [Errno 14] HTTPS Error 404 - Not Found

Looks like the sqlite index file got replaced while our test was running.
We'll probably have to mirror to be resilient to this (the proxy cannot
help you with something it did not proxy yet).
One thing to note is that a simple rsync mirror will not be enough; we
will need a mirror system that makes _atomic_ updates to the
mirror. Plain rsync will just make it behave the way the DS globalsync behaves.
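
To illustrate what "atomic" could mean here, a rough sketch (URLs and paths
are purely illustrative):

# sync into a fresh directory, then swap a symlink into place with a rename;
# the swap itself is atomic, and older snapshots can be kept around for jobs
# that are still running against them:
NEW=/srv/mirror/epel-7.$(date +%Y%m%d%H%M)
rsync -a rsync://mirror.example.org/epel/7/x86_64/ "$NEW/"
ln -sfn "$NEW" /srv/mirror/epel-7.tmp && mv -T /srv/mirror/epel-7.tmp /srv/mirror/epel-7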


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [oVirt Jenkins] test-repo_ovirt_experimental_master - Build #3801 - SUCCESS!

2016-12-01 Thread Barak Korren
On 1 December 2016 at 09:43, Anton Marchukov  wrote:
> Hello All.
>
> Let's try not to over-complicate this. The error here we should care about
> is "no more mirrors to try" Just use more than one mirror in yum
> configuration in ovrit system tests. That's what we enabled in standard ci
> and it works fine there (although we are not running this new config for a
> long).

This is IMO mis-diagnosing the issue - the problem is not a failed
mirror - the problem is 404 on getting a metadata file from a mirror
that was updated because you have a stale repomd.xml file on your
local cache. Another mirror will not help there because it would
probably be updated as well.

You could also solve it by running 'yum clean all' before every run, but
that would severely slow things down.

The best solution is IMO to have our own "stable" mirror that _never_
changes while jobs are running.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [oVirt Jenkins] test-repo_ovirt_experimental_master - Build #3801 - SUCCESS!

2016-12-01 Thread Barak Korren
On 1 December 2016 at 10:14, Anton Marchukov  wrote:
> Also an interesting discussion to have is the same we had for maven cache
> files.
>
> We use reposync in lago that is supposed to sync repos locally and then it
> can update the existing cache to match the remote mirror when it is invoked,
> then as I understand we get its cache deleted each time because we use mock.

That's not quite right - the cache dir (/var/lib/lago) is bind-mounted
into mock, so it persists across runs.

> Now I am not sure do we really need mock in ovirt system tests? As I
> understand it uses lago and lago runs everything inside vms so it is somehow
> isolated already?

You still need an isolated environment to run Lago itself and its
dependencies. Also the test code itself is not running in a VM.


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Rebase over other author's patch cannot be pushed to gerrit

2016-12-01 Thread Barak Korren
Forwarding to infra-support.

On 1 December 2016 at 18:03, Yaniv Bronheim  wrote:
> Not sure since when it was changed, but I noticed that I can't push patches
> if I'm not the author
>
> Counting objects: 50, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (50/50), done.
> Writing objects: 100% (50/50), 36.66 KiB | 0 bytes/s, done.
> Total 50 (delta 34), reused 0 (delta 0)
> remote: Resolving deltas: 100% (34/34)
> remote: Processing changes: refs: 1, done
> remote:
> remote: ERROR:  In commit db14ec7c1555a9eb37a0fb931bbb4ebdfc674bb4
> remote: ERROR:  author email address rnach...@redhat.com
> remote: ERROR:  does not match your user account.
> remote: ERROR:
> remote: ERROR:  The following addresses are currently registered:
> remote: ERROR:bronh...@gmail.com
> remote: ERROR:ybron...@redhat.com
> remote: ERROR:
> remote: ERROR:  To register an email address, please visit:
> remote: ERROR:  https://gerrit.ovirt.org/#/settings/contact
> remote:
> remote:
> To ssh://ybron...@gerrit.ovirt.org:29418/vdsm
>  ! [remote rejected] HEAD -> refs/for/master (invalid author)
> error: failed to push some refs to
> 'ssh://ybron...@gerrit.ovirt.org:29418/vdsm'
>
> We must have permissions to do that, this is part of the rebasing part, and
> I think its fine to fix patches on behalf of someone else.. but its not the
> best practice for reviewing.
> anyway, please undo this change, unless its something that related only to
> my env.. let me know
>
> Thanks
>
> --
> Yaniv Bronhaim.
>
> _______
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
>



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-04 Thread Barak Korren
On 3 December 2016 at 21:36, Nir Soffer  wrote:
> HI all,
>
> Watching vdsm travis builds in the last weeks, it is clear that vdsm tests
> on travis are about 4X times faster compared with jenkins builds.
>
> Here is a typical build:
>
> ovirt ci: 
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/5101/consoleFull
> travis ci: https://travis-ci.org/nirs/vdsm/builds/179056079
>
> The build took 4:34 on travis, and 19:34 on ovirt ci.

Interesting, thanks for looking at this!

>
> This has a huge impact on vdsm maintainers. Having to wait 20 minutes
> for each patch
> means that we must ignore the ci and merge and hope that previous tests 
> without
> rebasing on master were good enough.
>
> The builds are mostly the same, expect:
>
> - In travis we don't check if the build system was changed and
> packages should be built
>   takes 9:18 minutes in ovirt ci.

Well, I guess the infra team can't help with that, but still, is there
anything we could do at the infrastructure level to speed this up?

> - In travis we don't clean or install anything before the test, we use
> a container with all the
>   available packages, pulled from dockerhub.
>   takes about 3:52 minutes in ovirt ci

Well, I guess this is where we (the infra team) should pay attention.
We do have a plan to switch from mock to Docker at some point
(OVIRT-873 [1]), but it'll take a while until we can make such a large
switch.

It the meantime there may be some low-hanging fruit we can pick to
make things faster. Looking at the same log:

16:03:28 Init took 77 seconds

16:05:50 Install packages took 142 seconds

We may be able to speed those up - looking at the way mock is
configured, we may be running with its caches turned off (I'm not yet
100% sure about this - mock_runner.sh is not the simplest script...).
I've created OVIRT-902 [2] for us to look at this.

> - In travis we don't enable coverage. Running the tests with coverage
> may slow down the tests
>   takes 5:04 minutes in ovirt ci
>   creating the coverage report takes only 15 seconds, not interesting

We can easily check this by just sending a patch with coverage turned
on and then sending another patch set for the same patch with coverage
turned off.

> - In travis we don't cleanup anything after the test
>   this takes 34 seconds in ovirt ci

We can look at speeding this up - or perhaps just change things so
that results are reported as soon as check_patch.sh is done as opposed
to when the Jenkins job is done.
There may be some pitfalls here so I need to think a little more
before I recommend going down this path.

> The biggest problem is the build system check taking 9:18 minutes.
> fixing it will cut the build time in half.

Please try fixing that, or maybe this should just move to build_artifacts.sh?

[1]: https://ovirt-jira.atlassian.net/browse/OVIRT-873
[2]: https://ovirt-jira.atlassian.net/browse/OVIRT-902

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-04 Thread Barak Korren
> To debug this we need to get a shell on a jenkins slave with the exact
> environment of
> a running job.

Perhaps try to check if this reproduces with mock_runner.sh.

You can try running with with something like:

# placeholders - point JENKINS at a local clone of the oVirt 'jenkins' repo,
# and run from the root of the project you want to test:
JENKINS=/path/to/jenkins-repo-clone
cd /path/to/your/project
$JENKINS/mock_configs/mock_runner.sh --patch-only \
  --mock-confs-dir $JENKINS/mock_configs "fc24.*x86_64"



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-05 Thread Barak Korren
> Here are builds that do not change the build system:
> - 
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/5767/console:
> 10:07
> - 
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/4157/console:
> 10:16
>
> So we about 2X times faster now.

Awesome! also for fc24:

22:11:22 Init took 73 seconds

22:13:45 Install packages took 143 seconds

So 3m 36s, our pending patches can probably bring that down to around
20s. That will get us to around 7m...
Maybe we could shave some more seconds off by optimizing the git clone
and making some of the cleanups happen less frequently.
(It seems we spend 16s total outside of mock_runner.sh, so perhaps not
much to gain there).

So any more ideas where we can get extra 2-3m?

Things we didn`t try yet:
1. Ensure all downloads happen through the proxy (there is a patch
pending, but some tweaking in check_patch.sh may be needed as well)
2. Run mock in tmpfs (it has a plugin for that)
3. Avoid setting some FS attributes on files (mock is configured for
that but we don't install the OS package needed to make that actually
work)

Not sure any of the above will provide significant gains, though.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-05 Thread Barak Korren
On 5 December 2016 at 10:07, Nir Soffer  wrote:
>
> 20 seconds setup sounds great.
>
> Can we try the patches with vdsm builds?

We'll probably merge today, if not, I'll manually cherry-pick this for
the vdsm jobs.



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: engine 3.6 build fails

2016-12-06 Thread Barak Korren
On 6 December 2016 at 10:16, Barak Korren  wrote:
>>
>> I think that the issue could be related to this change:
>> https://gerrit.ovirt.org/#/c/67801
>>
>> adding Barak
>
> Hmm... yes my bad... it has to do with the el6 chroots using
> "groupsinstall" to setup things instead of "install"... I'll try to
> see if I can work around that... (I'm thinking I can get away with
> moving el6 to use "install" b/c the 'yum' doing the setup is the one
> on the host which should be el7 or newer)

Here, this should fix this:
https://gerrit.ovirt.org/#/c/67860/

Lets merge it quickly (and then I'll have to erase the mock cache on
the slave to get the chroot rebuilt)

One more thing - please make sure to address such emails to infra as
well next time so other infra members know what is going on...

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: engine 3.6 build fails

2016-12-06 Thread Barak Korren
>
> Here, this should fix this:
> https://gerrit.ovirt.org/#/c/67860/
>
> Lets merge it quickly (and then I'll have to erase the mock cache on
> the slave to get the chroot rebuilt)

Ok.
Patch merged and cache erased - please try rerunning the engine build.


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: false CI -1

2016-12-07 Thread Barak Korren
Same issue was previosly reported:

https://ovirt-jira.atlassian.net/browse/OVIRT-909

we have merged a patch to fix this, and manually cleaned the relevant slave.

On 7 December 2016 at 16:43, Martin Mucha  wrote:
> http://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-el7-x86_64/14430/
>
> - Original Message -
>>
>> Hi,
>> this seems as false ci -1
>>
>> http://jenkins.ovirt.org/job/ovirt-engine_master_find-bugs_created/9180/console
>>
>> M.
>>
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-07 Thread Barak Korren
On 7 December 2016 at 14:15, Nir Soffer  wrote:
> On Mon, Dec 5, 2016 at 10:11 AM, Barak Korren  wrote:
>> On 5 December 2016 at 10:07, Nir Soffer  wrote:
>>>
>>> 20 seconds setup sounds great.
>>>
>>> Can we try the patches with vdsm builds?
>>
>> We'll probably merge today, if not, I'll manually cherry-pick this for
>> the vdsm jobs.
>
> With all patches merged, builds take now:
>
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/4373/console
> 00:07:13.065 Finished: SUCCESS
>
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/5991/console
> 00:08:19.035 Finished: SUCCESS
>
> Last week it was 19-20 mintues, great improvement!



>
> What is missing now small improvement in the logs - add log for each part of 
> the
> build with the time it took.
>
> - setup
> - run script
> - cleanup
>
> The run script part is something that only project maintainers can optimize, 
> the
> rest can be optimized only by ci maintainers.

WRT times - mock_runner.sh does print those out to the output page
(surrounded by many asterisks and other symbols...)

WRT separation of logs - we already mostly have that, you can see
individual step logs in the job artifacts. For example for one of the
jobs above:

http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/5991/artifact/exported-artifacts/logs/mocker-fedora-24-x86_64.fc24.check-patch.sh/check-patch.sh.log

But yeah, we could probably make the output in the main Jenkins output
page better. That would require potentially breaking changes to
"mock_runner.sh", so I'd rather focus on ultimately replacing it...

We do already have https://ovirt-jira.atlassian.net/browse/OVIRT-682
for discussing this issue.

> I think we should have metrics collection system and keep these times
> so we can detect regressions and improvement easily. But the first step
> is measuring the time.

Jenkins (partially) gives us that, see here:
http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/buildTimeTrend

I completely agree that the UX here is not as good as it can and
should be, and we do have plans to make it A LOT better, please bare
with us in the meantime...

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Vdsm tests are 4X times faster on travis

2016-12-07 Thread Barak Korren
On 7 December 2016 at 14:15, Nir Soffer  wrote:
>
> Last week it was 19-20 mintues, great improvement!
>
there are a couple of other things we might try soon, that will,
perhaps, help us shave off another 2-3 minutes...




-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: CI slaves extremely slow - overloaded slaves?

2016-12-07 Thread Barak Korren
More discussion and tracking needed, moving to Jira. (please put any
further discussion there):

https://ovirt-jira.atlassian.net/browse/OVIRT-919

On 7 December 2016 at 21:33, Nir Soffer  wrote:
> Hi all,
>
> In the last weeks we see more and more test failures due to timeouts in the 
> CI.
>
> For example:
>
> 17:19:49 
> ==
> 17:19:49 FAIL: test_scale (storage_filesd_test.GetAllVolumesTests)
> 17:19:49 
> --
> 17:19:49 Traceback (most recent call last):
> 17:19:49   File
> "/home/jenkins/workspace/vdsm_master_check-patch-fc24-x86_64/vdsm/tests/storage_filesd_test.py",
> line 165, in test_scale
> 17:19:49 self.assertTrue(elapsed < 1.0, "Elapsed time: %f seconds"
> % elapsed)
> 17:19:49 AssertionError: Elapsed time: 1.105877 seconds
> 17:19:49  >> begin captured stdout << 
> -
> 17:19:49 1.105877 seconds
>
> This test runs in 0.048 seconds on my laptop:
>
> $ ./run_tests_local.sh storage_filesd_test.py -s
> nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
> storage_filesd_test.GetAllVolumesTests
> test_no_templates   OK
> test_no_volumes OK
> test_scale  0.047932 
> seconds
> OK
> test_with_template  OK
>
> --
> Ran 4 tests in 0.189s
>
> It seems that we are overloading the CI slaves. We should not use nested kvm
> for the CI, such vms are much slower then regular vms, and we probably run
> too many vms per cpu.
>
> We can disable such tests in the CI, but we do want to know when there is
> a regression in this code. Before it was fixed, the same test took 9 seconds
> on my laptop. We need fast machines in the CI for this.
>
> Nir
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] 4.0.x dependency failure (vdsm-jsonrpc-java)

2016-12-11 Thread Barak Korren
On 11 December 2016 at 11:18, Eyal Edri  wrote:
> Adding infra as well.
>
> I see a very strange thing happening on the build artifacts jobs:
>
> On [1] we see a successful build of 1.2.10 built in build-artifacts, but on
> the 2 patches merged after it, its back to 1.2.9 [2].
> Is it possible that the 2 patches merged after the version bump weren't
> rebased on the version branch and were built using older code?

Is build_artifacts running before or after merge?
This is impossible if it's running after.

> The project has 'fast forward only' mode in Gerrit.

If build_artifacts runs on pre-merge code, then even this will not help.

Bottom line - build_artifacts should never ever run on unmerged patches.


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] 4.0.x dependency failure (vdsm-jsonrpc-java)

2016-12-11 Thread Barak Korren
On 11 December 2016 at 16:01, Eyal Edri  wrote:
>
>
> AFAIK all of these builds were triggered on merged patches.

We need to see which code it's cloning - I suspect it is still trying
to clone the patch instead of the master branch. This may indeed be a
bug in how we configure the git plugin to work with the Gerrit
trigger.
(It just occurred to me it may need to be configured differently for
pre- and post-merge jobs.)

We will need to investigate this more deeply (compare git hashes jobs
get before and after merge, etc.)

But if its a ff-only repo, this means that you can't merge a patch
that needs a rebase, so if patches were rebased, and build_artifacts
runs post-merge, it should be impossible to get it to run on
non-rebased code.
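
For the record, a quick sketch of the kind of comparison meant here (run in the
job workspace, or add it temporarily to the job as a shell step):

# a post-merge build should be sitting on the branch head,
# not on a refs/changes/* patch ref:
git log -1 --oneline --decorate
git rev-parse HEAD
git branch -r --contains HEAD   # is this commit actually on the target branch?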

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: Why experimental is failing

2016-12-12 Thread Barak Korren
On 12 December 2016 at 12:47, "Eyal Edri"  wrote:

Looks like the issue is in the following code:

# disable any other repos to avoid downloading metadata
yum install --disablerepo=\* --enablerepo=alocalsync -y yum-utils
yum-config-manager --disable \*
yum-config-manager --enable alocalsync

in ost common/deploy-scripts/add_local_repo.sh

Might be happening due to new yum available now from CentoOS 7.3...

Do we really need to disable all repos when installing yum-utils?


alocalsync is probably Lago's internal repo, so we need to use only it.

We probably just need some more deps in reposync conf to resolve this.

But why are we installing yum-utils on engine at all?



-- Forwarded message --
From: 
Date: Mon, Dec 12, 2016 at 12:39 PM
Subject: [oVirt Jenkins] test-repo_ovirt_experimental_4.0 - Build #3547 -
FAILURE!
To: infra@ovirt.org


Build: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3547/,
Build Number: 3547,
Build Status: FAILURE
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra




-- 
Eyal Edri
Associate Manager
RHV DevOps
EMEA ENG Virtualization R&D
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)

___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: Why experimental is failing

2016-12-12 Thread Barak Korren
On 12 December 2016 at 13:43, Michal Skrivanek
 wrote:
>
>
> But why are we installing yum-utils on engine at all?
>
>
> not sure why on engine, but it is needed anyway on hosts

Again, why? yum-utils provides things like reposync and repoclosure, so
it should not be needed normally.

In any case, like I wrote before, we're probably seeing a breaking
dep-change in CentOS due to the 7.3 update, so it's better to just add what's
missing to the reposync conf.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: lago failure

2016-12-12 Thread Barak Korren
On 12 December 2016 at 13:45, Martin Polednik  wrote:
>
> + systemctl start rpcbind.service
> Failed to start rpcbind.service: Unit rpcbind.service failed to load: No
> such file or directory.
>
> Could you do something about the problems?
>

Looks like a breaking change in CentOS, we'll need to figure out what
to do to get NFS to work instead of restarting RPCBIND (Or perhaps we
have a missing package).
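
A couple of quick checks that should narrow this down on an affected host (a sketch):

rpm -q rpcbind nfs-utils                          # is the package installed at all?
systemctl list-unit-files | grep -E 'rpcbind|nfs' # which unit names exist after the 7.3 update
# on newer systemd it may be rpcbind.socket rather than rpcbind.service that
# needs to be enabled/started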


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: http://lists.ovirt.org/pipermail/announce/ is not reachable

2016-12-12 Thread Barak Korren
It can be reached now at lists.phx.ovirt.org (but we need a CNAME
from the old name)

On 12 December 2016 at 15:03, Eyal Edri  wrote:
> I'm not sure who gets emails sent to infra-owner list.  best to send it to
> infra or infra support.
>
> Adding Evgheni and Duck who are working on migration of the mm server.
>
> On Dec 12, 2016 2:58 PM, "Sandro Bonazzola"  wrote:
>>
>> The lists server is not reachable anymore. Looks like a dns issue:
>>
>> $ traceroute lists.ovirt.org
>> lists.ovirt.org: Name or service not known
>>
>>
>> --
>> Sandro Bonazzola
>> Better technology. Faster innovation. Powered by community collaboration.
>> See how it works at redhat.com
>
>
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.phx.ovirt.org/mailman/listinfo/infra
>



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: mirrors errors

2016-12-12 Thread Barak Korren
On 12 December 2016 at 17:27, Anton Marchukov  wrote:
> Actually Evgheni found the reason for that [1].

...

> we verify repoproxy availability in mock_genconfig and if it is not
> available - use default configs instead of proxied.

You could've asked me, I explained exactly this on one of the threads
that discuss these failures.
It's not using the default configs though - just the non-proxied ones,
and those conf files are managed by us as well.

> This essentially avoids
> all of our improvement patches. Guess that code is not needed anymore and
> can be safely removed.

Nah, it's good to have around so we're resilient to proxy server failures.
Since the check is based on the URL in the *-proxed.conf file, it
means that when you change that file to not use the repoproxy, the check
will stop using it as well.
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: Ability to run [job-name] when a tag is pushed to Gerrit

2016-12-12 Thread Barak Korren
On 12 December 2016 at 18:48, Vojtech Szocs  wrote:
>
> this was discussed loong time ago.. how much effort would it take
> for oVirt CI infra to support running e.g. [build-artifacts] when
> a git tag is pushed to Gerrit repo?
>

It's just a YAML patch to copy the existing build_artifacts job, give
it a new name and change the trigger configuration.

But the trigger may be a little tricky, since the Jenkins Gerrit
trigger does not have a specific event for tags, instead it has a
"Reference updated" event. So we may need to play a little with
filtering to ensure this gets called only for tag updates and not for
branch updates...

I guess you will need our (infra) help, do we have a Jira ticket to track this?

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: [OVIRT CI] Tests succeeded, build failed

2016-12-13 Thread Barak Korren
On 13 December 2016 at 22:56, Nir Soffer  wrote:
>
> Barak, do you think we can change the script so setup and cleanup failures are
> not treated as build failures but build errors?

Doing that means assuming that the cleanup script, which runs
flawlessly hundreds of times a day for all patches in all projects, suddenly
failed for reasons that have nothing to do with the patch that was
just tested.
I think we err on the right side of caution now...

> In travis such failure seem to start another build automatically,
> making developers life much nicer.

And in VMware there is no SPM, you can mix local and remote storages
on the same nodes, and upload images to the storage domain from the
GUI, making the admin's life much nicer. What is your point again?

On a more serious note, we may do auto-rerunning at some point, but we
need to re-engineer most of the standard-CI system to make that
happen, so it'll take a while.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: [OVIRT CI] Tests succeeded, build failed

2016-12-14 Thread Barak Korren
>
> I kept this build forever so people can inspect it:
> http://jenkins.ovirt.org/job/vdsm_master_check-patch-fc24-x86_64/6291/
>
Looking closer at this, there are two things here we may be able to address:

1. The cleanup script fails to un-mount a filesystem that is already not
   mounted. We can probably easily fix that [1].
2. The reason the cleanup script was trying to cleanup stuff in the 1st place
   was because a major mess was left around on the node by the vdsm
   check_merged job that ran on it prior to this job [2]. The check_merged job
   failed in such a way that made the cleanup script not run at all. I'm still
   not sure what it the root cause of that, so we'll need to further
   investigate.

I will make a quick fix for #1 so that #2 failures do not cascade into
other jobs until we can figure out why is it happening.

[1]: https://ovirt-jira.atlassian.net/browse/OVIRT-937
[2]: https://ovirt-jira.atlassian.net/browse/OVIRT-938

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.phx.ovirt.org/mailman/listinfo/infra


Re: Ability to run [job-name] when a tag is pushed to Gerrit

2016-12-15 Thread Barak Korren
On 14 December 2016 at 19:18, Vojtech Szocs  wrote:
>
> If I'm reading [a] correctly, Jenkins Gerrit trigger (Jenkins plugin)
> doesn't explicitly support "tag pushed" event, which is strange (?),
> there is only "Ref Updated" which should include tag pushes.

It's not the Gerrit trigger that is at fault here, it's just the way the
event that Gerrit sends looks. That is to say, Gerrit doesn't
have a specific event for tags.

> So if it's not very feasible to implement e.g. "on-tag-push" trigger,
> I'm OK with [2] - posting Gerrit comment on (already merged & tagged)
> patch to re-trigger the build-artifacts job.

Well, it may be feasible, but we don't have any pre-existing
experience doing that, so it'll take some trial and error to get it done
right, and perhaps some limitations may need to be put in place (for
example, on the format the tag string can have, so we can
differentiate it from branches).


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


oVirt master experimental system tests now failing on snapshot merge

2016-12-18 Thread Barak Korren
After we've fixed the various system test issues that arose from the
CentOS 7.3 release, we're now seeing a new failure that seems to has
to do with snapshot merges.

I'm guessing this may have to do with something that went in last week
while we "weren't looking".

Failing job can be seen here:
http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4275

The test code snippet that is failing is as follows:

226 api.vms.get(VM0_NAME).snapshots.list()[-2].delete()
227 testlib.assert_true_within_short(
228 lambda:
229 (len(api.vms.get(VM0_NAME).snapshots.list()) == 2) and
230 (api.vms.get(VM0_NAME).snapshots.list()[-1].snapshot_status
231  == 'ok'),
232 )

The failure itself is a test timeout:

  ...
  File 
"/home/jenkins/workspace/test-repo_ovirt_experimental_master/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
line 228, in snapshot_merge
lambda:
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
248, in assert_true_within_short
allowed_exceptions=allowed_exceptions,
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
240, in assert_true_within
raise AssertionError('Timed out after %s seconds' % timeout)
AssertionError: Timed out after 180 seconds

Engine log generated during the test can be found here:

http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/4275/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity.py/lago-basic-suite-master-engine/_var_log_ovirt-engine/engine.log

Please have a look.

Thanks,
Barak.


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: oVirt master experimental system tests now failing on snapshot merge

2016-12-18 Thread Barak Korren
On 18 December 2016 at 17:26, Nir Soffer  wrote:
> On Sun, Dec 18, 2016 at 4:17 PM, Barak Korren  wrote:

> We a lot of these errors in the rest of the log. This meas something
> is wrong with this vg.
>
> Needs deeper investigation from storage developer on both engine and vdsm 
> side,
> but I would start by making sure we use clean luns. We are not trying
> to test esoteric
> negative flows in the system tests.

Here is the storage setup script:
https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=blob;f=common/deploy-scripts/setup_storage_unified_he_extra_iscsi_el7.sh;hb=refs/heads/master

All storage used in the system tests comes from the engine VM itself,
and is placed on a newly allocated QCOW2 file (exposed as /dev/sde to
the engine VM), so it's unlikely that the LUNs are not clean.

> Did we change something in the system tests project or lago while we
> were not looking?

Not likely either:
https://gerrit.ovirt.org/gitweb?p=ovirt-system-tests.git;a=shortlog

The ovirt-system-tests project has its own CI, testing against the
last nightly (we will move it to the last build that passed the tests
soon). So we are unlikely to merge breaking code there. Then again,
we're not gating the OS packages, so some breakage may have gone in via
the CentOS repos...

> Can we reproduce this issue manually with same engine and vdsm versions?

You have several options:
1: Get engine+vdsm builds from Jenkins:
   http://jenkins.ovirt.org/job/ovirt-engine_master_build-artifacts-fc24-x86_64/
   http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-el7-x86_64/
   (Getting the exact builds that went into a given OST run takes tracing
back the job invocation links from that run)

2: Use the latest experimental repo:
   http://resources.ovirt.org/repos/ovirt/experimental/master/latest/rpm/el7/

3: Run lago and OST locally:
   (as documented here:
http://ovirt-system-tests.readthedocs.io/en/latest/
you'd need to pass in the vdsm and engine packages to use)



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Heads up! Influence of our recent Mock/Proxy changes on Lago jobs

2016-12-23 Thread Barak Korren
Hi infra team members!

As you may know, we've recently changed our proxied Mock configuration
so that the 'http_proxy' environment variable gets defined inside the
Mock environment. This was in an effort to make 'pip', 'curl' and
'wget' commands go through our PHX proxy. As it turns out, this also
has an unforeseen influence on yum tools.

Now, when it comes to yum as it is used inside the mock environment, we
have long had the proxied configuration hard-wire it to use the proxy by
setting it in "yum.conf". However, so far, yum tools (such as
reposync) that brought their own configuration essentially bypassed
the "yum.conf" file and hence were not using the proxy.

Well, now it turns out that 'yum' and the derived tools also respect
the 'http_proxy' environment variable [1]:

10.2. Configuring Proxy Server Access for a Single User

To enable proxy access for a specific user, add the lines in the example box
below to the user's shell profile. For the default bash shell, the
profile is
the file ~/.bash_profile. The settings below enable yum to use the proxy
server mycache.mydomain.com, connecting to port 3128.

# The Web proxy server used by this account
http_proxy="http://mycache.mydomain.com:3128";
export http_proxy

This is generally a good thing, but it can lead to unexpected
consequences.

Case in point: the Lago job reposync failures of last Thursday (Dec 22nd, 2016).

The root-cause behind the failures was that the
"ovirt-web-ui-0.1.0-4.el7.centos.x86_64.rpm" file was changed in the
"ovirt-master-snapshot-static" repo. Updating an RPM file without
changing the version or revision numbers breaks YUM`s rules and makes
reposync choke. We already knew about this and actually had a
work-around in the Lago code [2].

When I came in on Thursday morning and saw reposync failing in all the
Lago jobs, I just assumed that our work-around had simply failed to work.
My assumption was reinforced by the fact that I was able to reproduce
the issue by running 'reposync' manually on the Lago hosts, and also
managed to rectify it by removing the offending file from the reposync
cache. I spent the next few hours chasing down failing jobs and
cleaning up the caches on the hosts they ran on. It took me a while to
figure out that I was seeing the problem (essentially, the older
version of the package file) reappear on the same hosts over and over
again!
Wondering how that could be, and after ensuring the older package file
was nowhere to be found in any of the repos the jobs were using, Gal
and I took a look at the Lago code to see if it could be causing the
issue. Imagine our puzzlement when we realized the work-around code
was doing _exactly_ what I was doing manually, and still somehow
managed to make the very issue it was designed to solve reappear!
Eventually the problem seemed to disappear on its own. Now, armed with
the knowledge above, I can provide a plausible explanation for what we
were seeing.
The difference between my manual executions of 'reposync' and the way
Lago was running it was that Lago was running within Mock, where
'http_proxy' was defined. What was probably happening is that reposync
kept getting the old RPM file from the proxy while still getting a
newer yum metadata file.

Conclusion - the next time such an issue arises, we must make sure to
clear the PHX proxy cache. There is actually no need to clear the
cache on the Lago hosts themselves, because our work-around will
resolve the issue there. Longer term, we may configure the proxy to not
cache files coming from resources.ovirt.org.
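
For reference, if the proxy in question is Squid and the PURGE method is
enabled on it, clearing a single stale object could look roughly like this
(host name and URL are illustrative):

squidclient -h <proxy-host> -m PURGE \
  http://resources.ovirt.org/pub/ovirt-master-snapshot-static/rpm/el7/x86_64/ovirt-web-ui-0.1.0-4.el7.centos.x86_64.rpm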


[1]: https://www.centos.org/docs/5/html/yum/sn-yum-proxy-server.html
[2]: 
https://github.com/lago-project/lago/blob/master/ovirtlago/reposetup.py#L141-L153

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Heads up! Influence of our recent Mock/Proxy changes on Lago jobs

2016-12-23 Thread Barak Korren
On 23 December 2016 at 21:02, Anton Marchukov  wrote:
> Hello Barak.
>
> But why this should be handled on infra side? Was that infra code that
> produced two RPMs with same name and version and different content? If not
> then I would bug it against whoever code is creating such RPMs and it then
> should be rebuild with at least rpm release incremented and hence does not
> require cache invalidation.

I did not query Sandro about why he did the update the way he did. He
knows far more than I do about the various build processes of the
various packages and oVirt, and I tend to trust his judgement.

> Any reason we are not doing that?

This can take time, every maintainer has his own (bad) habits, and not
everyone will agree to do what we want (Some downright regard oVirt as
a "downstream" consumer and refuse to do anything that they regard as
oVirt-specific!).

In the meantime we need to be resilient to such issues if we can. We
can't just let everything fail while we try to "fix the world".

Also, next time around, we could be seeing similar caching issues with
a non-yum/rpm file, so its good to have a deep understanding of the
data paths into our system.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Fwd: [oVirt Jenkins] test-repo_ovirt_experimental_4.1 - Build #41 - SUCCESS!

2016-12-25 Thread Barak Korren
\o/

I'm guessing the imageio-proxy fix made it in...


-- Forwarded message --
From:  
Date: 25 December 2016 at 11:31
Subject: [oVirt Jenkins] test-repo_ovirt_experimental_4.1 - Build #41 - SUCCESS!
To: sbona...@redhat.com, infra@ovirt.org


Build: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.1/41/,
Build Number: 41,
Build Status: SUCCESS
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: CI unable to fetch changes from git

2016-12-25 Thread Barak Korren
Should be fixed now.

On 25 December 2016 at 10:29, Eyal Edri  wrote:
> There is a problem with git daemon on Gerrit server,  we're looking into it
> from the morning,  until resolved,  http/https access should work.
>
> Hopefully the issue should be fixed soon.
>
> We'll update once we have more info.
>
> On Dec 25, 2016 10:25 AM, "Allon Mureinik"  wrote:
>>
>> I've been seeing similar messages in all the recently updated patches on
>> gerrit, regardless of which project they belong to:
>>
>> 08:16:47 Fetching upstream changes from
>> git://gerrit.ovirt.org/ovirt-engine.git
>> 08:16:47  > git --version # timeout=10
>> 08:16:47  > git -c core.askpass=true fetch --tags --progress
>> git://gerrit.ovirt.org/ovirt-engine.git refs/changes/74/69074/1 --prune
>> 08:16:47 ERROR: Error fetching remote repo 'origin'
>> 08:16:47 hudson.plugins.git.GitException: Failed to fetch from
>> git://gerrit.ovirt.org/ovirt-engine.git
>> 08:16:47 at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:766)
>> 08:16:47 at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1022)
>> 08:16:47 at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1053)
>> 08:16:47 at
>> org.jenkinsci.plugins.multiplescms.MultiSCM.checkout(MultiSCM.java:129)
>> 08:16:47 at hudson.scm.SCM.checkout(SCM.java:485)
>> 08:16:47 at
>> hudson.model.AbstractProject.checkout(AbstractProject.java:1269)
>> 08:16:47 at
>> hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:607)
>> 08:16:47 at
>> jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
>> 08:16:47 at
>> hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
>> 08:16:47 at hudson.model.Run.execute(Run.java:1738)
>> 08:16:47 at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
>> 08:16:47 at
>> hudson.model.ResourceController.execute(ResourceController.java:98)
>> 08:16:47 at hudson.model.Executor.run(Executor.java:410)
>> 08:16:47 Caused by: hudson.plugins.git.GitException: Command "git -c
>> core.askpass=true fetch --tags --progress
>> git://gerrit.ovirt.org/ovirt-engine.git refs/changes/74/69074/1 --prune"
>> returned status code 128:
>> 08:16:47 stdout:
>> 08:16:47 stderr: fatal: unable to connect to gerrit.ovirt.org:
>> 08:16:47 gerrit.ovirt.org[0: 107.22.212.69]: errno=Connection refused
>>
>> Can someone take a look please?
>>
>>
>> Thanks!
>>
>>
>> ___
>> Infra mailing list
>> Infra@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/infra
>>
>
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
>



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Future of the oVirt website

2017-01-06 Thread Barak Korren
On 6 January 2017 at 07:13, Marc Dequènes (Duck)  wrote:
> Quack,
>
> So I just discovered this thread:
>   http://lists.ovirt.org/pipermail/devel/2017-January/029097.html
>
> First, it would be nice if the infra team was involved directly, because
> not everyone is also an oVirt developer (and on this list). Also there
> are already plans to improve the site system and build and this
> side-initiative feels like an unexpected and rude disruption of energy
> already invested.

We (as in I) did get involved here:
http://lists.ovirt.org/pipermail/devel/2017-January/029103.html

It's important to understand where people are coming from and what
they really want, and then come up with the proper tools and processes.

I think I agree with you that the current flow is best suited for
maintaining www.ovirt.org as a public documentation and marketing
site. What the developers are discussing is a different requirement
that, IMO, is not suited to the main public site.

The developers have a process where, when creating a new feature, they
first create a feature page on the (so called) "wiki", and then use it
in discussions about the feature.
This is useful for developers, but the end result is a mountain of
half-edited and out of date "feature pages" that provide (IMO) very
poor service to the community.

So, in summary, the current flow does not fit what the developers want
(fast, easy editing of feature design documents), and OTOH the
documents the developers make probably do not belong on the main site
in the form they are made, and had been put there in the past just
because the main site used to be a wiki.

So, IMO, we'd better just let the developers vent; most of them do
not contribute publicly consumable documentation anyway, and they will
probably just find another place to put the design documents and all
will be well.

> It seems people forget how things were in the past, which leads to going
> back and forth between a new solution and the previous one. People wish
> for an easy way to contribute, and this is a legitimate goal. After some
> time an easy solution make things complicated because it is such a mess
> and there is no review, so no quality checks, and people wish to have
> workflows. Then they find it to cumbersome and wish to go back to a
> marvelous past. And so on again and again.

No, you misinterpret their goals, they are not looking to contribute
to the main site, they are looking to discuss and develop new
features. That flow never belonged on the main site, and the sooner
everyone understands that the better.

> This said, this does not mean the current solution is perfect and we
> should not think about a better one, but we should recall why we
> abandoned a wiki to switch to the current solution, so we don't fall
> into the same traps.

There is no real conflict here, just a misunderstanding. The current
flow is very good for what OSAS wants to do, which is to maintain a
high-quality public website. It's just not as good for what the
developers want, which is a place to type their half-ideas into. Those
needs do not really collide, the developers just need to go somewhere
else anyway.


> What I can say on the topic is that migrating is painful, so we should
> be cautious. OSAS is not here to force a solution upon you, but the
> infra team (and the OSAS folks too), have a limited workforce to
> dedicate to this project, so let's make something realistic. Also we
> just finish another pass of cleanup of the current site, with migration
> bugs from the previous Mediawiki solution, so keep in mind it would
> probably take _years_ to really get something clean. Who's gonna do this?

No one. I don't see this discussion as a cause for migration.
The developers can keep talking; as long as none of them volunteers to
do any real work on this, there is no risk of things changing (as
you say, OSAS and the infra team have enough other work).

> I also wanted to say I totally disagree on someone's remark (somewhere
> in the thread) about doc not being as important as code. A lot of
> content is obsolete or mistaken in the current site already, and this
> means giving a very bad image of the project, raising the number of
> silly questions people come to bother you with on ML or IRC, so I think
> doc should really be taken seriously. As a user it is often I have to
> dig in the code to find undocumented features, or why a documented one
> does not work as said, and that's fu^Wutterly boring.

This is another discussion altogether, and a worthy one. You should
probably reply to it directly (on the ML), addressing the developer
who made that remark; I'm sure most other developers will agree with you.


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] [ovirt-users] Lowering the bar for wiki contribution?

2017-01-08 Thread Barak Korren
On 8 January 2017 at 10:17, Roy Golan  wrote:
> Adding infra which I forgot to add from the beginning

I don't think this is an infra issue; it's more of a community/working
procedures one.

I'm going to state my view on this, the same one I've stated before,
with hopes of reaching a conclusion that will be beneficial to
everyone.

The core of the issue here is that we have two conflicting needs.

On the one hand, the developers need a place where they create and
discuss design documents and road maps. That place needs to be as
friction-free as possible to allow developers to work on the code
instead of on the documentation tools.

On the other hand, the user community needs a good, up-to-date source
of information about oVirt and how to use it. To keep the information
quality high, there is a need for some editorial process. The OSAS
team stepped in to offer editorial services and is doing tremendous
work behind the scenes. This creates the desire to use the same tools
and flows that OSAS is using for other projects.

One thing that I don't think is a hard requirement is to have the
design documents on the main oVirt website. In fact, I think that the
(typically outdated) design documents being stored on the main site
and linked to as "features" have been a major source of confusion for
users.

This all leads me to the conclusion that the currently offered
solution, of using the GitHub project wiki feature, may indeed be the
appropriate one for this (even though I personally resent the
growing dependency of open-source projects on proprietary,
closed-source online platforms).

Having said the above, I don't think the site project's wiki is the
best place for this. The individual project mirrors on GitHub may be
better for this (I've enabled the wiki for engine [1] and vdsm [2],
for example). For system-wide features, I suppose OST [3] will be
appropriate, with hopes that this will provide some incentive to write
tests for those features.

[1]: https://github.com/oVirt/ovirt-engine/wiki
[2]: https://github.com/oVirt/vdsm/wiki
[3]: https://github.com/oVirt/ovirt-system-tests/wiki

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Future of the oVirt website

2017-01-08 Thread Barak Korren
> All of the above makes the contribution barrier high. With that being said,
> we should probably arrange a cleanup hackaton to
> remove, update all the pages.
>
Let's not create parallel threads; please keep the discussion on the
main thread in devel.

Having contributed a blog post to the main site, I don't find the
barrier to be as high as you say it is (I did it without even cloning
the repo). But again, let's discuss this in the main thread, not here.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: When trigged manually the check-merged job

2017-01-09 Thread Barak Korren
On 9 January 2017 at 11:32, Yaniv Bronheim  wrote:
> When trigged manually the check-merged job for specific patch we don't see
> the link in the patch History - see in https://gerrit.ovirt.org/#/c/69802/3
>
> it used to be different .. can we get the link back?

The replies to Gerrit are done by the Gerrit Trigger plugin. If it is
not used to trigger the job, then it also does not (and never did)
report to Gerrit.

Our Jenkins is configured with complex triggering mechanisms, so please
avoid triggering jobs manually. If you want to make jobs run, you can
emulate Gerrit events with the "Query and Trigger Gerrit patches"
screen here:
http://jenkins.ovirt.org/gerrit_manual_trigger/

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Build failed in Jenkins: ovirt_4.0_he-system-tests #627

2017-01-10 Thread Barak Korren
On 10 January 2017 at 12:23, Yaniv Kaul  wrote:
>
>
> On Tue, Jan 10, 2017 at 12:08 PM, Lev Veyde  wrote:
>>
>> This patch is one that caused it probably:
>>
>> https://github.com/lago-project/lago/commit/05ccf7240976f91b0c14d6a1f88016376d5e87f0
>
>
> +Milan.
>
> I must confess that I did not like the patch to begin with...
> I did not understand what real problem it solved, but Michal assured me
> there was a real issue.
> I know have Engine with a Java@ 100% CPU - I hope it's unrelated to this as
> well.
>
> I suggest we do survey to see who doesn't have SandyBridge and above and
> perhaps move higher than Westmere.
> What do we have in CI?

+Evgheny



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Please rebase your patches to the 'jenkins' repo

2017-01-16 Thread Barak Korren
Hi,

I've recently changed the check-patch jobs that are running on the
'jenkins' repo to require the version of 'mock_runner.sh' that was
created in the following patch:

https://gerrit.ovirt.org/#/c/69250/

That patch was already merged a few days ago. Please make sure all
your patches to the 'jenkins' repo are based on top of it or on top of
a recent 'master' that includes it. Failing to do this will cause your
check-patch jobs to fail showing the 'mock_runner.sh' usage message.
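
If your open patches predate it, a quick rebase onto the current
'master' is usually all that's needed. Roughly (assuming the usual
'origin' remote name; adjust to your local setup):

    git fetch origin
    git rebase origin/master
    # resolve any conflicts, then re-push the patch for review:
    git push origin HEAD:refs/for/master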

Thanks,

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: CI gives +1 on gerrit as a response to 'ci please build'.

2017-01-16 Thread Barak Korren
On 16 Jan 2017 at 04:14 PM, "Eyal Edri"  wrote:



On Mon, Jan 16, 2017 at 4:05 PM, Gil Shinar  wrote:

> Edy showed me the issue. I think we haven't thought about that.
> 'ci please build' actually executes build artifacts jobs and when they
> succeed, they change the "continuous integration" flag to +1.
>
> Can we control it from the Gerrit?
>

I couldn't find any reference for a response from Jenkins to Gerrit in the
yaml templates in std CI, so I'm not sure it's configurable per job; it might be
on the main Jenkins configuration.
Barak - do you know where the grading is defined for Gerrit Trigger in
YAML?



Yeah we should be able to do it from the trigger YAML.
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: CI gives +1 on gerrit as a response to 'ci please build'.

2017-01-16 Thread Barak Korren
On 16 January 2017 at 16:14, Eyal Edri  wrote:
>
>
> On Mon, Jan 16, 2017 at 4:05 PM, Gil Shinar  wrote:
>>
>> Edy showed me the issue. I think we haven't thought about that.
>> 'ci please build' actually executes build artifacts jobs and when they
>> succeed, they change the "continuous integration" flag to +1.
>>
>> Can we control it from the Gerrit?

No need, we can do it from the job.

> I couldn't find any reference for response from Jenkins to Grrit in the yaml
> templates in std CI, so not sure its configurable per job, it might be on
> the main Jenkins configuration.
> Barak - do you know where the grading is defined for Gerrit Trigger in YAML?

Yeah, you can use 'skip-vote' [1].
This requires Gerrit Trigger Plugin version >= 2.7.0.

[1]: 
http://docs.openstack.org/infra/jenkins-job-builder/triggers.html#triggers.gerrit





-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [lago-devel] Lago v0.32 released

2017-01-17 Thread Barak Korren
On 18 January 2017 at 00:41, Nadav Goldin  wrote:

> Hi, as luckily this bug was discovered early, and mock cache refreshes
> once in 2 days, the only failure I could spot due to that is here[1],
> so to be sure Lago v0.32 won't get installed on the slaves, I cleaned
> up manually the mock cache on all ovirt-srv*, so on next run they
> should pull v0.33.
>

To avoid having to play with mock caches in the future and remove the
potential to break OST on Lago updates, I suggest we specify an exact lago
version in the OST *.packages files.
This will allow using the OST check_patch job as essentially a gate for
Lago updates. It will also mean that when we do update Lago, mock caches
will be invalidated immediately.
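
I.e. something along these lines in the relevant suite's *.packages
file (the exact file and the version spec below are only placeholders
for whatever release we decide to gate on):

    lago-0.33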


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [ovirt-devel] planned Jenkins restart

2017-01-21 Thread Barak Korren
On 20 January 2017 at 19:56, Evgheni Dereveanchin 
wrote:

> Maintenance completed, Jenkins back up
> and running. As always - if you see any
> issues please report them to Jira.
>
> Among other things, two new plugins were
> installed today:
>
> - Test Results Analyzer
>A plugin that shows history of test execution
>results in a tabular format.
>
>
\o/

http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/test_results_analyzer/




> - Embeddable Build Status
>This plugin allows Jenkins to expose the current
>status of your build as an image in a fixed URL.
>You can put this URL into other sites (such as GitHub
>README) so that people can see the current
>state of the job (last build) or for a specific build.
>
>
\o/

http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/badge/


-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


New image releases and image file names

2017-01-22 Thread Barak Korren
Hi there,

In the oVirt projects we're making some efforts to make it easy to get
stuff up and running on top of oVirt.

One of these efforts is maintaining a Glance server containing updated
"cloud" VM images of various distros including, among others, Fedora,
CentOS, Ubuntu and Atomic.

That Glance server is configured for use by default in all oVirt installations.

Since upstream images can update quite often, we've made efforts to
automate the process of polling the distro download servers for new
images, and downloading them into Glance.

In order to make images useful for users, we need to know various
details like which OS version is in the image. Since every distro has
got its own idea about how to organise the download server and name
the files in it, we had to resort to regex-matching the image file
names to extract useful details about them.

This is why it's very frustrating to us when an upstream changes the
way it names its image files, especially when the files are named in a
way that prevents keeping an orderly image archive.

As a new CentOS Atomic image was recently released [1], we set about
getting that image uploaded to our Glance server. We found that the
name of the new image file is very different from how Atomic images
used to be named.

It seems the new image is called simply:
CentOS-Atomic-Host-7-GenericCloud.qcow2

Older images, in contrast, were named something like:
CentOS-Atomic-Host-7.1609-GenericCloud.qcow2

The newer name is useless as far as image version tracking goes. We
would like to request that the previous naming scheme be restored, or
at least that some other scheme be created where it's easy to
automatically figure out the image versions.
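
To illustrate (a rough bash sketch, not our actual import code): with
the old scheme a simple pattern yields a trackable version string,
while the new scheme leaves nothing useful to extract:

    pattern='^CentOS-Atomic-Host-(.+)-GenericCloud\.qcow2$'
    for fname in CentOS-Atomic-Host-7.1609-GenericCloud.qcow2 \
                 CentOS-Atomic-Host-7-GenericCloud.qcow2; do
        # BASH_REMATCH[1] holds whatever sits in the version slot
        [[ "$fname" =~ $pattern ]] && echo "version field: '${BASH_REMATCH[1]}'"
    done
    # old name -> '7.1609' (identifies the release)
    # new name -> '7'      (every future release will look the same)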

[1]: http://www.projectatomic.io/blog/2017/01/centos-atomic-jan17/

Thanks,

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: [atomic] New image releases and image file names

2017-01-23 Thread Barak Korren
On 23 January 2017 at 02:26, Matthew Miller  wrote:
>
> We have a long-standing but hopefully soon-to-be-resolved request for
> an image index file. Sounds like this is another ideal use case. Can
> you please add a comment to https://fedorahosted.org/rel-eng/ticket/5805?
>

I've known of that ticket for a while. I've even opened the equivalent
ticket in CentOS:
https://bugs.centos.org/view.php?id=10730
https://lists.centos.org/pipermail/centos/2016-April/158788.html

Unfortunately I don't seem to be permitted to comment on it (Either
that or I'm too stupid to find the "comment" button, which is
computationally equivalent ;).

Both tickets did not receive a response for a long, long time, so we
(oVirt infra) eventually "took things into our own hands" in oVirt and
made scripts that can deal with whatever is out there (most D/L
servers have _some_ kind of file index).

We even have plans to solve this issue for everyone by making a
virt-builder index file for our public Glance server:
https://ovirt-jira.atlassian.net/browse/OVIRT-819

But as the Atomic project is not currently maintaining anything that
could be interpreted as an image archive, we cannot work with it. It
is a shame because it used to work fine up until a few months ago.

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: false CI -1

2017-01-31 Thread Barak Korren
On 1 February 2017 at 09:03, Martin Mucha  wrote:
> RROR:  Failed to umount
> /var/lib/mock/fedora-24-x86_64-cbc0a2f0986f033688ce66749e6745c0-8306/root/sys.
>
>
> http://jenkins.ovirt.org/job/ovirt-engine_master_check-patch-fc24-x86_64/12144/console
>
>

That is not the cause of the error; this is:

04:54:52 Build timed out (after 360 minutes). Marking the build as failed.




-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Build failed in Jenkins: deploy-to-ovirt_experimental_4.1 #1510

2017-02-02 Thread Barak Korren
9,877::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_4.1_check-merged-el7-x86_64/1/ is an iso
> 2017-02-02 
> 21:37:29,877::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,878::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_4.1_build-artifacts-fc24-x86_64/ is an iso
> 2017-02-02 
> 21:37:29,878::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,878::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_4.1_build-artifacts-fc24-x86_64/1/ is an 
> iso
> 2017-02-02 
> 21:37:29,878::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,879::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_master_check-merged-fc24-x86_64/ is an iso
> 2017-02-02 
> 21:37:29,879::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,879::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_master_check-merged-fc24-x86_64/1/ is an 
> iso
> 2017-02-02 
> 21:37:29,880::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,880::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_master_build-artifacts-fc24-x86_64/ is an 
> iso
> 2017-02-02 
> 21:37:29,880::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,881::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if /job/ovirt-engine-nodejs-modules_master_build-artifacts-fc24-x86_64/1/ is 
> an iso
> 2017-02-02 
> 21:37:29,881::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,881::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if api/ is an iso
> 2017-02-02 
> 21:37:29,882::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,882::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if http://jenkins-ci.org/ is an iso
> 2017-02-02 
> 21:37:29,882::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,883::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if # is an iso
> 2017-02-02 
> 21:37:29,883::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 21:37:29,883::DEBUG::repoman.common.parser.parse:73::Checking 
> source KojiURLSource with 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
> 2017-02-02 21:37:29,884::DEBUG::repoman.common.parser.parse:73::Checking 
> source DirSource with 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
> 2017-02-02 
> 21:37:29,884::DEBUG::repoman.common.stores.iso.handles_artifact:152::Checking 
> if 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
>  is an iso
> 2017-02-02 
> 21:37:29,885::DEBUG::repoman.common.stores.iso.handles_artifact:158::  It is 
> not
> 2017-02-02 
> 21:37:29,885::DEBUG::repoman.common.sources.dir.expand:102::Skipping 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
> 2017-02-02 21:37:29,886::ERROR::repoman.common.parser.parse:108::No artifacts 
> found for source 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
> Traceback (most recent call last):
>   File "/usr/bin/repoman", line 10, in 
> sys.exit(main())
>   File "/usr/lib/python2.7/site-packages/repoman/cmd.py", line 455, in main
> exit_code = do_add(args, config, repo)
>   File "/usr/lib/python2.7/site-packages/repoman/cmd.py", line 338, in do_add
> repo.add_source(art_src.strip())
>   File "/usr/lib/python2.7/site-packages/repoman/common/repo.py", line 153, 
> in add_source
> self.parse_source_stream(sys.stdin.readlines())
>   File "/usr/lib/python2.7/site-packages/repoman/common/repo.py", line 193, 
> in parse_source_stream
> self.add_source(line.strip())
>   File "/usr/lib/python2.7/site-packages/repoman/common/repo.py", line 175, 
> in add_source
> artifact_paths = self.parser.parse(artifact_source)
>   File "/usr/lib/python2.7/site-packages/repoman/common/parser.py", line 110, 
> in parse
> raise Exception(msg)
> Exception: No artifacts found for source 
> http://jenkins.ovirt.org/job/ovirt-engine-nodejs-modules_4.1_build-artifacts-el7-x86_64/1/
> 2017-02-02 21:37:29,898::INFO::repoman.common.repo.cleanup:35::Cleaning up 
> temporary dir 
> /srv/resources/repos/ovirt/experimental/.lago_tmp/tmppLHpKt/tmpWsWzgp/tmpOHITGx
> 2017-02-02 21:37:29,899::INFO::repoman.common.repo.cleanup:35::Cleaning up 
> temporary dir 
> /srv/resources/repos/ovirt/experimental/.lago_tmp/tmppLHpKt/tmpWsWzgp
> 2017-02-02 21:37:29,899::INFO::repoman.common.repo.cleanup:35::Cleaning up 
> temporary dir /srv/resources/repos/ovirt/experimental/.lago_tmp/tmppLHpKt
> + rm -f /home/deploy-ovirt-experimental/experimental_repo.lock
> Build step 'Execute shell' marked build as failure
> [ssh-agent] Stopped.
> ___
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra



-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra


Re: Jenkins jobs ownership

2017-02-05 Thread Barak Korren
On 3 February 2017 at 15:59, Sandro Bonazzola  wrote:
> 8<-
> Sandro, can someone from you team fix this?
>
...
>
> If you want us to maintain this, this code must move into ovirt-imageio
> repository, so we have full control of it.
> >8

There are currently two pieces of information about the project that
need to be known in order to build and release it, but are
currently specified in YAML in the 'jenkins' repo, rather than within
the project's own repo:
1. Which platforms the project should be built on
2. Which oVirt releases a particular build of the project should be included in.

The fact that this is specified in the 'jenkins' repo **does not place
this outside the maintainers' responsibility**. Things are done this
way only because of a technical limitation in the way Standard-CI
jobs are currently created.

We actually have an initiative to move this information to the project
repos. I've started by asking on the devel list about how to specify
this as part of Standard-CI [1], but have received little topical
response so far.

[1]: http://lists.ovirt.org/pipermail/devel/2017-January/029161.html

-- 
Barak Korren
bkor...@redhat.com
RHCE, RHCi, RHV-DevOps Team
https://ifireball.wordpress.com/
___
Infra mailing list
Infra@ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra

