Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
On Sun, Nov 20, 2016 at 06:09:59PM +0200, Nir Soffer wrote:
> On Sun, Nov 20, 2016 at 5:12 PM, Barak Korren wrote:
> >> With the current setting (in vdsm), submitting a series of patches is
> >> a huge pain. Sometimes refreshing the page and submitting the next
> >> patch in the series works, but sometimes you have to rebase the next
> >> patches in the series again, and in the worst cases you have to do
> >> several rebases in the same series. This happens even when the entire
> >> series was already rebased properly before the submit.
> >
> > Actually vdsm is configured to "Cherry Pick" ATM. I'm not sure what
> > the reasons for this were, but it should probably be changed to
> > ff-only ASAP, because as it is, it allows patches to be submitted
> > completely out of order.
> >
> >> In vdsm we were bitten by this many times, and both Dan and I agree
> >> now that fast-forward is the only way.
> >>
> >> I don't think we need to agree on this for all projects; the whole
> >> point of having multiple projects is that we don't have to agree on
> >> every little detail, and the project maintainers can do whatever
> >> they want.
> >
> > Ok, so can we get an agreement between the vdsm maintainers to change
> > to "ff-only"?
>
> +1
>
> Dan, can you confirm?

I enjoyed the freedom of cherry-pick, but after 2 broken nightly builds
in the span of 10 days, I give up. Let's try ff-only.

Dan.
___
Devel mailing list
Devel@ovirt.org
http://lists.ovirt.org/mailman/listinfo/devel
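For reference, the submit type being discussed here is set per project in Gerrit, either through the project settings UI or by editing `project.config` on the project's `refs/meta/config` branch. A minimal sketch of what switching vdsm to fast-forward-only could look like (the surrounding sections in the real file will differ):

```ini
# project.config on the project's refs/meta/config branch (sketch)
[submit]
	action = fast forward only
# Other valid values include: merge if necessary, rebase if necessary,
# cherry pick, merge always
```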
Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
On Sun, Nov 20, 2016 at 9:18 PM, Sandro Bonazzola wrote:
> On 20/Nov/2016 15:08, "Barak Korren" wrote:
> >
> > Hi there,
> >
> > I would like to address a concern that has been raised to us by
> > multiple developers, and reach an agreement on how (and if) to
> > remedy it.
> >
> > Let's assume the following situation:
> > We have a Git repo in Gerrit with top commit C0 in master.
> > At time t0 developers Alice and Bob push patches P1 and P2
> > respectively to master, so that we end up with the following
> > situation in git:
> > C0 <= P1 (this is Alice's patch)
> > C0 <= P2 (this is Bob's patch)
> >
> > At time t1 CI runs for both patches, checking the code as it looks
> > for each patch. Let's assume CI is successful for both.
> >
> > At time t2 Alice submits her patch and Gerrit merges it, resulting
> > in the following situation in master:
> > C0 <= P1
> >
> > At time t3 Bob submits his patch. Gerrit, seeing master has changed,
> > rebases the patch and merges it; the resulting situation (if the
> > rebase is successful) is:
> > C0 <= P1 <= P2
> >
> > This means that the resulting code was never tested in CI. This, in
> > turn, causes various failures to show up post-merge despite
> > pre-merge CI having run successfully.
> >
> > This situation is a result of the way our repos are currently
> > configured. Most repos ATM are configured with the "Rebase If
> > Necessary" submit type. This means that Gerrit tries to automatically
> > rebase patches as mentioned at t3 above.
> >
> > We could, instead, configure the repos to use the "Fast Forward Only"
> > submit type. In that case, when Bob submits at t3, Gerrit refuses to
> > merge and asks Bob to rebase (while offering a convenient button to
> > do it). When he does, a new patch set gets pushed, and subsequently
> > checked by CI.
> >
> > I recommend we switch all projects to use the "Fast Forward Only"
> > submit type.
> >
> > Thoughts? Concerns?
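The race Barak describes can be made concrete with a toy model of the commit graph: a submit is a fast-forward only when the branch head is already an ancestor of the patch being submitted. This is an illustrative sketch (the names C0/P1/P2 follow the example above; none of this is Gerrit's actual code):

```python
# Toy model: commits are nodes with a parent pointer; "Fast Forward Only"
# means the target branch head must be reachable from the submitted patch.

class Commit:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

def is_ancestor(ancestor, commit):
    """True if `ancestor` is reachable from `commit` via parent links."""
    while commit is not None:
        if commit is ancestor:
            return True
        commit = commit.parent
    return False

def can_fast_forward(branch_head, patch):
    # "Fast Forward Only": the patch must already contain the branch head.
    return is_ancestor(branch_head, patch)

C0 = Commit("C0")
P1 = Commit("P1", parent=C0)   # Alice's patch, based on C0
P2 = Commit("P2", parent=C0)   # Bob's patch, also based on C0

master = C0
assert can_fast_forward(master, P1)   # t2: Alice submits; ff is possible
master = P1
# t3: Bob submits; P2 does not contain P1, so Gerrit refuses and asks Bob
# to rebase -- and the rebased patch set then gets re-checked by CI.
print(can_fast_forward(master, P2))   # False
```

"Rebase If Necessary" hides exactly this refusal by rebasing P2 automatically, which is why the rebased result reaches master without ever having been through CI.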
> AFAIR this was enabled for the ovirt-engine project in the past, and it
> was pretty much impossible to merge any patch with CI+1 when some
> important dates were near (like feature freeze), because all maintainers
> tried to merge patches and waited for CI to finish. Personally I'd say
> that the current status is OK, because it's the responsibility of a
> maintainer to check the CI results of a patch that he/she merged (and if
> an error is raised, then investigate the issue and post a fix ASAP if
> needed). So "Fast Forward Only" could successfully work for smaller
> projects, but I don't think it will work for big projects like engine
> or vdsm.

+1 for me

> > --
> > Barak Korren
> > bkor...@redhat.com
> > RHEV-CI Team
Re: [ovirt-devel] Merge gating in Gerrit
On 20/Nov/2016 17:25, "Nir Soffer" wrote:
>
> On Sun, Nov 20, 2016 at 5:39 PM, Yedidyah Bar David wrote:
> > On Sun, Nov 20, 2016 at 5:06 PM, Barak Korren wrote:
> >> Hi all,
> >>
> >> Perhaps the main purpose of CI is to prevent breaking code from
> >> getting merged into the stable/master branches. Unfortunately our CI
> >> is not there yet, and one of the reasons for that is that we do a
> >> large amount of our CI tests only _after_ the code is merged.
> >>
> >> The reason for that is that when balancing thorough but
> >> time-consuming tests (e.g. an engine build with all permutations)
> >> vs. faster but more basic ones (e.g. "findbugs" and a
> >> single-permutation build), we typically choose the faster tests to
> >> run per-patch-set and leave the thorough testing to be run only
> >> post-merge.
> >>
> >> We'd like to change that and have the thorough tests also run before
> >> merge. Ideally we would like to just hook stuff to the "submit"
> >> button, but Gerrit doesn't allow one to do that easily. So instead
> >> we'll need to adopt some kind of flag to indicate we want to submit,
> >> and have Jenkins "click" the submit button on our behalf if tests
> >> pass.
> >>
> >> I see two options here:
> >> 1. Use Code-Review+2 as the indicator to run "heavy" CI and merge.
>
> This is problematic. For example in vdsm we have 5 maintainers with
> +2, and 4 maintainers with commit rights, but only 2 are commenting
> regularly.
>
> >> 2. Add an "approve" flag that maintainers can set to +1 (This is
> >>    what OpenStack is doing).
>
> This seems better.
>
> But there is another requirement - a maintainer should be able to
> commit even if Jenkins fails. Sometimes the CI is broken, or there are
> flaky tests breaking the build, and some jobs are failing regularly
> (check-merged) and I don't want to wait for them.

Either disable the jobs or fix them. Having jobs consistently failing
and just ignoring them is a waste of resources.
> Today we can override the CI vote and commit; if we keep it as is, I
> don't see any problem with this change.
>
> Nir
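The flow discussed in this thread (option 2 plus Nir's override requirement) could be sketched as a small decision function that a Jenkins gating job might evaluate before running the heavy tests and pressing submit. The label names below are assumptions for illustration, not the final oVirt configuration:

```python
# Sketch of a merge-gating decision: a maintainer sets the approve flag,
# the light per-patch-set CI must have passed, and a maintainer may
# explicitly override a failing/broken CI vote.

def should_gate(labels, maintainer_override=False):
    """Decide whether Jenkins should run heavy CI and auto-submit."""
    approved = labels.get("Approved", 0) >= 1              # maintainer's flag
    light_ci_ok = labels.get("Continuous-Integration", 0) >= 1
    # Nir's requirement: a maintainer can push through even when CI fails,
    # but the approve flag is still required.
    return approved and (light_ci_ok or maintainer_override)

print(should_gate({"Approved": 1, "Continuous-Integration": 1}))  # True
print(should_gate({"Code-Review": 2}))                            # False
```

Keeping the decision in one place like this also makes the override path auditable, which addresses the "who merged with CI broken" question.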
Re: [ovirt-devel] Merge gating in Gerrit
On 20/Nov/2016 16:06, "Barak Korren" wrote:
>
> Hi all,
>
> Perhaps the main purpose of CI is to prevent breaking code from
> getting merged into the stable/master branches. Unfortunately our CI
> is not there yet, and one of the reasons for that is that we do a
> large amount of our CI tests only _after_ the code is merged.
>
> The reason for that is that when balancing thorough but time-consuming
> tests (e.g. an engine build with all permutations) vs. faster but more
> basic ones (e.g. "findbugs" and a single-permutation build), we
> typically choose the faster tests to run per-patch-set and leave the
> thorough testing to be run only post-merge.
>
> We'd like to change that and have the thorough tests also run before
> merge.

Hopefully not the same tests ☺

> Ideally we would like to just hook stuff to the "submit" button, but
> Gerrit doesn't allow one to do that easily. So instead we'll need to
> adopt some kind of flag to indicate we want to submit, and have
> Jenkins "click" the submit button on our behalf if tests pass.
>
> I see two options here:
> 1. Use Code-Review+2 as the indicator to run "heavy" CI and merge.
> 2. Add an "approve" flag that maintainers can set to +1 (This is
>    what OpenStack is doing).
>
> What would you prefer?

I would prefer to follow the OpenStack example. It will help developers
to have the same flow in both projects.

> --
> Barak Korren
> bkor...@redhat.com
> RHEV-CI Team
Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
On 20/Nov/2016 15:08, "Barak Korren" wrote:
>
> Hi there,
>
> I would like to address a concern that has been raised to us by
> multiple developers, and reach an agreement on how (and if) to remedy
> it.
>
> Let's assume the following situation:
> We have a Git repo in Gerrit with top commit C0 in master.
> At time t0 developers Alice and Bob push patches P1 and P2 respectively
> to master, so that we end up with the following situation in git:
> C0 <= P1 (this is Alice's patch)
> C0 <= P2 (this is Bob's patch)
>
> At time t1 CI runs for both patches, checking the code as it looks for
> each patch. Let's assume CI is successful for both.
>
> At time t2 Alice submits her patch and Gerrit merges it, resulting in
> the following situation in master:
> C0 <= P1
>
> At time t3 Bob submits his patch. Gerrit, seeing master has changed,
> rebases the patch and merges it; the resulting situation (if the
> rebase is successful) is:
> C0 <= P1 <= P2
>
> This means that the resulting code was never tested in CI. This, in
> turn, causes various failures to show up post-merge despite pre-merge
> CI having run successfully.
>
> This situation is a result of the way our repos are currently
> configured. Most repos ATM are configured with the "Rebase If
> Necessary" submit type. This means that Gerrit tries to automatically
> rebase patches as mentioned at t3 above.
>
> We could, instead, configure the repos to use the "Fast Forward Only"
> submit type. In that case, when Bob submits at t3, Gerrit refuses to
> merge and asks Bob to rebase (while offering a convenient button to do
> it). When he does, a new patch set gets pushed, and subsequently
> checked by CI.
>
> I recommend we switch all projects to use the "Fast Forward Only"
> submit type.
>
> Thoughts? Concerns?
+1 for me

> --
> Barak Korren
> bkor...@redhat.com
> RHEV-CI Team
Re: [ovirt-devel] system tests failing on template export
On Nov 20, 2016 6:33 PM, "Nir Soffer" wrote:
>
> On Sun, Nov 20, 2016 at 6:25 PM, Eyal Edri wrote:
> > It happened again in [1]
> >
> > 2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task]
> > (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/task.py", line 877, in _run
> >     return fn(*args, **kargs)
> >   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
> >     res = f(*args, **kwargs)
> >   File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
> >     allTasksInfo = sp.getAllTasksInfo()
> >   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
> >     raise SecureError("Secured object is not in safe state")
> > SecureError: Secured object is not in safe state
> > 2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task]
> > (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted:
> > u'Secured object is not in safe state' - code 100 (task:1175)
> > 2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured
> > object is not in safe state (dispatcher:80)
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
> >     result = ctask.prepare(func, *args, **kwargs)
> >   File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
> >     return m(self, *a, **kw)
> >   File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
> >     raise self.error
> > SecureError: Secured object is not in safe state
>
> This can also mean that the SPM is not started yet. Maybe you are not
> waiting until the SPM is ready before you try to perform an operation?
>
> Who is the owner of this test? This person should debug this test.

The relevant team for the feature.
> > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-006_network_by_label.py/lago-basic-suite-master-host1/_var_log_vdsm/vdsm.log
> >
> > The storage VM is running on the same VM as engine (to save memory)
> > and it's serving both NFS & iSCSI.
> > Do you think running it on the same VM as engine might cause such
> > issues?
>
> I don't think so, but this prevents testing a lot of interesting
> negative flows.

Which don't belong in this suite.

> For example, when one storage server is down, the system should be
> able to use the other storage domain. Having each storage server in
> its own VM makes this possible.

You have both NFS and iSCSI there. It's trivial to set up multiple of
each if needed, of course. I do wish to add more IPs and test iSCSI
bonding as well as both NFSv3 and NFSv4.

> Also, we may like to test multiple storage servers of the same type.
> The storage servers should be decoupled so we can start any number
> of them as needed for the current test.

Right, but not in this suite. Again, it's trivial to do so. The main
motivation was to conserve resources so everyone could run the tests.
Y.
> > > On Mon, Oct 17, 2016 at 11:45 PM, Adam Litke wrote: > >> > >> On 17/10/16 11:51 +0200, Piotr Kliczewski wrote: > >>> > >>> Adam, > >>> > >>> I see constant failures due to this and found: > >>> > >>> 2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task] > >>> Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error > >>> (task:870) > >>> Traceback (most recent call last): > >>> File "/usr/share/vdsm/storage/task.py", line 877, in _run > >>>return fn(*args, **kargs) > >>> File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in > >>> wrapper > >>>res = f(*args, **kwargs) > >>> File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo > >>>allTasksInfo = sp.getAllTasksInfo() > >>> File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", > >>> line 77, in wrapper > >>>raise SecureError("Secured object is not in safe state") > >>> SecureError: Secured object is not in safe state > >> > >> > >> This usually indicates that the SPM role has been lost which happens > >> most likely due to connection issues with the storage. What is the > >> storage environment being used for the system tests? > >> > >>> > >>> Please take a look not sure whether it is related. You can find latest > >>> build here [1] > >>> > >>> Thanks, > >>> Piotr > >>> > >>> [1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/668/ > >>> > >>> On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin > >>> wrote: > > Hello, > > We've got several cases today where system tests failed > when attempting to export templates: > > > http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/(root)/004_basic_sanity/template_export/ > > Related engine.log looks something like this: > https://paste.fedoraproject.org/449936/47643643/raw
Re: [ovirt-devel] Failures in OST (4.0/master) ( was error msg from Jenkins )
On Nov 20, 2016 6:30 PM, "Eyal Edri" wrote:
>
> Renaming title and adding devel.
>
> On Sun, Nov 20, 2016 at 2:36 PM, Piotr Kliczewski wrote:
>>
>> The last failure seems to be storage related.
>>
>> @Nir please take a look.
>>
>> Here is the engine-side error:
>>
>> 2016-11-20 05:54:59,605 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
>> (default task-5) [59fc0074] Exception: org.ovirt.engine.core.vdsbroker.irsbroker.IRSNoMasterDomainException:
>> IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain:
>> u'spUUID=1ca141f1-b64d-4a52-8861-05c7de2a72b2, msdUUID=7d4bf750-4fb8-463f-bbb0-92156c47306e'
>>
>> and here is vdsm:
>>
>> jsonrpc.Executor/5::ERROR::2016-11-20 05:54:56,331::multipath::95::Storage.Multipath::(resize_devices) Could not resize device 360014052749733c7b8248628637b990f
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/multipath.py", line 93, in resize_devices
>>     _resize_if_needed(guid)
>>   File "/usr/share/vdsm/storage/multipath.py", line 101, in _resize_if_needed
>>     for slave in devicemapper.getSlaves(name)]
>>   File "/usr/share/vdsm/storage/multipath.py", line 158, in getDeviceSize
>>     bs, phyBs = getDeviceBlockSizes(devName)
>>   File "/usr/share/vdsm/storage/multipath.py", line 150, in getDeviceBlockSizes
>>     "queue", "logical_block_size")).read())
>> IOError: [Errno 2] No such file or directory: '/sys/block/sdb/queue/logical_block_size'
>
> We now see a different error in master [1], which also indicates the
> hosts are in a problematic state (failing the 'assign_hosts_network_label'
> test):
>
> status: 409
> reason: Conflict
> detail: Cannot add Label. Operation can be performed only when Host
> status is Maintenance, Up, NonOperational.

I believe you are mixing unrelated issues.
I've seen this once, and I have an unproven theory: the previous suite
restarts Engine after LDAP configuration and then performs its test,
which is quite short (24 seconds on my poor laptop + a few additional
seconds between suites). I'm not convinced that is enough time for the
hosts' status to be updated back to Up in Engine.
Y.

> >> begin captured logging <<
>
> [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/testReport/junit/(root)/006_network_by_label/assign_hosts_network_label/
>
>> On Sun, Nov 20, 2016 at 12:50 PM, Eyal Edri wrote:
>>>
>>> On Sun, Nov 20, 2016 at 1:42 PM, Yaniv Kaul wrote:
>>>>
>>>> On Sun, Nov 20, 2016 at 1:30 PM, Yaniv Kaul wrote:
>>>>>
>>>>> On Sun, Nov 20, 2016 at 1:18 PM, Eyal Edri wrote:
>>>>>> the test fails to run a VM because no hosts are in the UP state(?)
>>>>>> [1]; not sure it is related to the triggering patch [2]
>>>>>>
>>>>>> status: 400
>>>>>> reason: Bad Request
>>>>>> detail: There are no hosts to use. Check that the cluster contains
>>>>>> at least one host in Up state.
>>>>>>
>>>>>> Thoughts? Shouldn't we fail the test earlier when hosts are not UP?
>>>>>
>>>>> Yes. It's more likely that we are picking the wrong host or so, but
>>>>> who knows - where are the engine and VDSM logs?
>>>>
>>>> A simple grep on the engine.log [1] finds several unrelated issues
>>>> I'm not sure are reported; it's despairing to even begin...
>>>> That being said, I don't see the issue there. We may need better
>>>> logging on the API level, to see what is being sent. Is it
>>>> consistent?
>>>
>>> Just failed now for the first time, I didn't see it before.
>>> Y.
>>>>
>>>> [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/artifact/exported-artifacts/basic_suite_4.0.sh-el7/exported-artifacts/test_logs/basic-suite-4.0/post-004_basic_sanity.py/lago-basic-suite-4-0-engine/_var_log_ovirt-engine/engine.log
>
> Y.
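If this theory is right, a fix on the test side would be to poll the hosts' status after the engine restart instead of assuming it survived. A generic wait helper could look like the sketch below; `get_status` is a placeholder for however the suite actually queries the engine (not an existing OST/Lago API):

```python
# Poll a status callable until it reports the wanted state or a timeout
# expires. The callable and state names are illustrative placeholders.
import time

def wait_for_status(get_status, wanted="up", timeout=300, interval=2,
                    _sleep=time.sleep):
    """Return True once get_status() == wanted, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status() == wanted:
            return True
        _sleep(interval)   # injectable for testing; time.sleep by default
    return False

# Usage sketch (hypothetical SDK call):
# assert wait_for_status(lambda: host_service.get().status.value, "up")
```

Running such a wait at the start of each suite would also make the "fail the test earlier when hosts are not UP" suggestion from the quoted thread explicit rather than incidental.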
> >> >> >> >> [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/testReport/junit/(root)/004_basic_sanity/vm_run/ >> [2] http://jenkins.ovirt.org/job/ovirt-engine_4.0_build-artifacts-el7-x86_64/1535/changes#detail >> >> >> >> On Sun, Nov 20, 2016 at 1:00 PM, wrote: >>> >>> Build: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/, >>> Build Number: 3015, >>> Build Status: FAILURE >>> ___ >>> Infra mailing list >>> in...@ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/infra >>> >> >> >> >> -- >> Eyal Edri >> Associate Manager >> RHV DevOps >> EMEA ENG Virtualization R&D >> Red Hat Israel >> >> phone: +972-9-7692018 >> irc: eedri (on #tlv #rhev-dev #rhev-integ) > > >>> >>> >>> >>> -- >>> Eyal Edri >>> Associate Manager >>> RHV DevOps >>> EMEA ENG Virtualization R&D >>> Red Hat Israel >>> >>> phone: +972-9-7692018 >>> irc: eedri (on #tlv #rhev-dev #rhev-integ) >> >> > > > > -- > Eyal Edri > Associate Manager > RHV DevOps > EMEA ENG Virtualizat
Re: [ovirt-devel] Failures in OST (4.0/master) ( was error msg from Jenkins )
On Sun, Nov 20, 2016 at 6:30 PM, Eyal Edri wrote:
> Renaming title and adding devel.
>
> On Sun, Nov 20, 2016 at 2:36 PM, Piotr Kliczewski wrote:
>>
>> The last failure seems to be storage related.
>>
>> @Nir please take a look.
>>
>> Here is the engine-side error:
>>
>> 2016-11-20 05:54:59,605 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
>> (default task-5) [59fc0074] Exception: org.ovirt.engine.core.vdsbroker.irsbroker.IRSNoMasterDomainException:
>> IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain:
>> u'spUUID=1ca141f1-b64d-4a52-8861-05c7de2a72b2, msdUUID=7d4bf750-4fb8-463f-bbb0-92156c47306e'
>>
>> and here is vdsm:
>>
>> jsonrpc.Executor/5::ERROR::2016-11-20 05:54:56,331::multipath::95::Storage.Multipath::(resize_devices) Could not resize device 360014052749733c7b8248628637b990f
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/multipath.py", line 93, in resize_devices
>>     _resize_if_needed(guid)
>>   File "/usr/share/vdsm/storage/multipath.py", line 101, in _resize_if_needed
>>     for slave in devicemapper.getSlaves(name)]
>>   File "/usr/share/vdsm/storage/multipath.py", line 158, in getDeviceSize
>>     bs, phyBs = getDeviceBlockSizes(devName)
>>   File "/usr/share/vdsm/storage/multipath.py", line 150, in getDeviceBlockSizes
>>     "queue", "logical_block_size")).read())
>> IOError: [Errno 2] No such file or directory: '/sys/block/sdb/queue/logical_block_size'

Please open a bug for this. This is an expected situation (a device can
disappear in the middle of a scan), and we should be able to cope with
it.

Adding Fred, who worked on this area.

Nir

> We now see a different error in master [1], which also indicates the
> hosts are in a problematic state (failing the
> 'assign_hosts_network_label' test):
>
> status: 409
> reason: Conflict
> detail: Cannot add Label. Operation can be performed only when Host
> status is Maintenance, Up, NonOperational.
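The "cope with it" fix Nir asks for would amount to treating a vanishing sysfs file as "device went away during the scan, skip it" rather than letting the IOError propagate. A hedged sketch (the path mirrors the traceback above; the function name is invented, and the real change would belong in vdsm's multipath module):

```python
# Read a device's logical block size from sysfs, tolerating the device
# disappearing mid-scan (the exact failure in the traceback above).

def logical_block_size(dev):
    """Return the block size in bytes, or None if the device is gone."""
    path = "/sys/block/%s/queue/logical_block_size" % dev
    try:
        with open(path) as f:
            return int(f.read())
    except FileNotFoundError:
        # Device was removed between enumeration and this read;
        # the caller should simply skip resizing it.
        return None

# Caller sketch:
# size = logical_block_size("sdb")
# if size is None:
#     log.debug("device sdb disappeared during scan, skipping")
```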
> >> begin captured logging << > > > [1] > http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/testReport/junit/(root)/006_network_by_label/assign_hosts_network_label/ > > >> >> >> >> On Sun, Nov 20, 2016 at 12:50 PM, Eyal Edri wrote: >>> >>> >>> >>> On Sun, Nov 20, 2016 at 1:42 PM, Yaniv Kaul wrote: On Sun, Nov 20, 2016 at 1:30 PM, Yaniv Kaul wrote: > > > > On Sun, Nov 20, 2016 at 1:18 PM, Eyal Edri wrote: >> >> the test fails to run VM because no hosts are in UP state(?) [1], not >> sure it is related to the triggering patch[2] >> >> status: 400 >> reason: Bad Request >> detail: There are no hosts to use. Check that the cluster contains at >> least one host in Up state. >> >> Thoughts? Shouldn't we fail the test earlier we hosts are not UP? > > > Yes. It's more likely that we are picking the wrong host or so, but who > knows - where are the engine and VDSM logs? A simple grep on the engine.log[1] finds serveral unrelated issues I'm not sure are reported, it's despairing to even begin... That being said, I don't see the issue there. We may need better logging on the API level, to see what is being sent. Is it consistent? >>> >>> >>> Just failed now the first time, I didn't see it before. >>> Y. [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/artifact/exported-artifacts/basic_suite_4.0.sh-el7/exported-artifacts/test_logs/basic-suite-4.0/post-004_basic_sanity.py/lago-basic-suite-4-0-engine/_var_log_ovirt-engine/engine.log > > Y. 
> >> >> >> >> [1] >> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/testReport/junit/(root)/004_basic_sanity/vm_run/ >> [2] >> http://jenkins.ovirt.org/job/ovirt-engine_4.0_build-artifacts-el7-x86_64/1535/changes#detail >> >> >> >> On Sun, Nov 20, 2016 at 1:00 PM, >> wrote: >>> >>> Build: >>> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/, >>> Build Number: 3015, >>> Build Status: FAILURE >>> ___ >>> Infra mailing list >>> in...@ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/infra >>> >> >> >> >> -- >> Eyal Edri >> Associate Manager >> RHV DevOps >> EMEA ENG Virtualization R&D >> Red Hat Israel >> >> phone: +972-9-7692018 >> irc: eedri (on #tlv #rhev-dev #rhev-integ) > > >>> >>> >>> >>> -- >>> Eyal Edri >>> Associate Manager >>> RHV DevOps >>> EMEA ENG Virtualization R&D >>> Red Hat Israel >>> >>> phone: +972-9-7692018 >>> irc: eedri (on #tlv #rhev-dev #rhev-integ) >> >> > > > > -- > Eyal Edri > Associate Manager > RHV DevOps > EMEA ENG Virtualization R&D > Red Hat Israel > > phone: +972-9-7692018 > irc: eedri (on #tlv #rhev-dev #rhev-integ
Re: [ovirt-devel] system tests failing on template export
On Sun, Nov 20, 2016 at 6:25 PM, Eyal Edri wrote:
> It happened again in [1]
>
> 2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task]
> (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>     return fn(*args, **kargs)
>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
>     allTasksInfo = sp.getAllTasksInfo()
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
>     raise SecureError("Secured object is not in safe state")
> SecureError: Secured object is not in safe state
> 2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task]
> (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted:
> u'Secured object is not in safe state' - code 100 (task:1175)
> 2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured
> object is not in safe state (dispatcher:80)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
>     result = ctask.prepare(func, *args, **kwargs)
>   File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
>     return m(self, *a, **kw)
>   File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
>     raise self.error
> SecureError: Secured object is not in safe state

This can also mean that the SPM is not started yet. Maybe you are not
waiting until the SPM is ready before you try to perform an operation?

Who is the owner of this test? This person should debug this test.
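For context, the SecureError above comes from vdsm's "securable" pattern: methods marked as secured refuse to run until the object (the storage pool, once the host has become SPM) reaches its safe state. A much simplified sketch of the idea (names shortened and invented for illustration; see vdsm/storage/securable.py for the real implementation):

```python
# Minimal sketch of the securable pattern: a decorator that rejects
# calls until the object has been switched into its safe state.
from functools import wraps

class SecureError(Exception):
    pass

def secured(f):
    @wraps(f)
    def wrapper(self, *args, **kwargs):
        if not self._secured:
            raise SecureError("Secured object is not in safe state")
        return f(self, *args, **kwargs)
    return wrapper

class StoragePool:
    def __init__(self):
        self._secured = False       # SPM not started yet

    def start_spm(self):
        self._secured = True        # pool is now safe to operate on

    @secured
    def getAllTasksInfo(self):
        return {}

pool = StoragePool()
try:
    pool.getAllTasksInfo()          # too early: SPM not ready
except SecureError as e:
    print(e)                        # Secured object is not in safe state
pool.start_spm()
print(pool.getAllTasksInfo())       # {}
```

This is why "wait for the SPM before operating" is the likely fix on the test side: the error is the guard working as designed, not storage corruption.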
> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-006_network_by_label.py/lago-basic-suite-master-host1/_var_log_vdsm/vdsm.log
>
> The storage VM is running on the same VM as engine (to save memory)
> and it's serving both NFS & iSCSI.
> Do you think running it on the same VM as engine might cause such
> issues?

I don't think so, but this prevents testing a lot of interesting
negative flows.

For example, when one storage server is down, the system should be
able to use the other storage domain. Having each storage server in
its own VM makes this possible.

Also, we may like to test multiple storage servers of the same type.
The storage servers should be decoupled so we can start any number
of them as needed for the current test.

> On Mon, Oct 17, 2016 at 11:45 PM, Adam Litke wrote:
>>
>> On 17/10/16 11:51 +0200, Piotr Kliczewski wrote:
>>>
>>> Adam,
>>>
>>> I see constant failures due to this and found:
>>>
>>> 2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task]
>>> Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error
>>> (task:870)
>>> Traceback (most recent call last):
>>>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>>     return fn(*args, **kargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
>>>     res = f(*args, **kwargs)
>>>   File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo
>>>     allTasksInfo = sp.getAllTasksInfo()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
>>>     raise SecureError("Secured object is not in safe state")
>>> SecureError: Secured object is not in safe state
>>
>> This usually indicates that the SPM role has been lost, which happens
>> most likely due to connection issues with the storage. What is the
>> storage environment being used for the system tests?
>> >>> >>> Please take a look not sure whether it is related. You can find latest >>> build here [1] >>> >>> Thanks, >>> Piotr >>> >>> [1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/668/ >>> >>> On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin >>> wrote: Hello, We've got several cases today where system tests failed when attempting to export templates: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/(root)/004_basic_sanity/template_export/ Related engine.log looks something like this: https://paste.fedoraproject.org/449936/47643643/raw/ I could not find any obvious issues in SPM logs, could someone please take a look to confirm what may be causing this issue? Full logs from the test are available here: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/ Regards, Evgheni Dereveanchin ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel >> >> >> -- >> Adam Litke >> >> ___ >> Devel mailing list >> Devel@ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] Failures in OST (4.0/master) ( was error msg from Jenkins )
Renaming title and adding devel.

On Sun, Nov 20, 2016 at 2:36 PM, Piotr Kliczewski wrote:
> The last failure seems to be storage related.
>
> @Nir please take a look.
>
> Here is the engine-side error:
>
> 2016-11-20 05:54:59,605 DEBUG [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
> (default task-5) [59fc0074] Exception: org.ovirt.engine.core.vdsbroker.irsbroker.IRSNoMasterDomainException:
> IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain:
> u'spUUID=1ca141f1-b64d-4a52-8861-05c7de2a72b2, msdUUID=7d4bf750-4fb8-463f-bbb0-92156c47306e'
>
> and here is vdsm:
>
> jsonrpc.Executor/5::ERROR::2016-11-20 05:54:56,331::multipath::95::Storage.Multipath::(resize_devices) Could not resize device 360014052749733c7b8248628637b990f
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/multipath.py", line 93, in resize_devices
>     _resize_if_needed(guid)
>   File "/usr/share/vdsm/storage/multipath.py", line 101, in _resize_if_needed
>     for slave in devicemapper.getSlaves(name)]
>   File "/usr/share/vdsm/storage/multipath.py", line 158, in getDeviceSize
>     bs, phyBs = getDeviceBlockSizes(devName)
>   File "/usr/share/vdsm/storage/multipath.py", line 150, in getDeviceBlockSizes
>     "queue", "logical_block_size")).read())
> IOError: [Errno 2] No such file or directory: '/sys/block/sdb/queue/logical_block_size'

We now see a different error in master [1], which also indicates the
hosts are in a problematic state (failing the 'assign_hosts_network_label'
test):

status: 409
reason: Conflict
detail: Cannot add Label. Operation can be performed only when Host
status is Maintenance, Up, NonOperational.
>> begin captured logging <<

[1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/testReport/junit/(root)/006_network_by_label/assign_hosts_network_label/

> On Sun, Nov 20, 2016 at 12:50 PM, Eyal Edri wrote:
>>
>> On Sun, Nov 20, 2016 at 1:42 PM, Yaniv Kaul wrote:
>>>
>>> On Sun, Nov 20, 2016 at 1:30 PM, Yaniv Kaul wrote:
>>>>
>>>> On Sun, Nov 20, 2016 at 1:18 PM, Eyal Edri wrote:
>>>>> the test fails to run a VM because no hosts are in the UP state(?)
>>>>> [1]; not sure it is related to the triggering patch [2]
>>>>>
>>>>> status: 400
>>>>> reason: Bad Request
>>>>> detail: There are no hosts to use. Check that the cluster contains
>>>>> at least one host in Up state.
>>>>>
>>>>> Thoughts? Shouldn't we fail the test earlier when hosts are not UP?
>>>>
>>>> Yes. It's more likely that we are picking the wrong host or so, but
>>>> who knows - where are the engine and VDSM logs?
>>>
>>> A simple grep on the engine.log [1] finds several unrelated issues
>>> I'm not sure are reported; it's despairing to even begin...
>>> That being said, I don't see the issue there. We may need better
>>> logging on the API level, to see what is being sent. Is it
>>> consistent?
>>
>> Just failed now for the first time, I didn't see it before.
>>
>>> Y.
>>>
>>> [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/artifact/exported-artifacts/basic_suite_4.0.sh-el7/exported-artifacts/test_logs/basic-suite-4.0/post-004_basic_sanity.py/lago-basic-suite-4-0-engine/_var_log_ovirt-engine/engine.log
>>>
>>>>> [1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/testReport/junit/(root)/004_basic_sanity/vm_run/
>>>>> [2] http://jenkins.ovirt.org/job/ovirt-engine_4.0_build-artifacts-el7-x86_64/1535/changes#detail
>>>>>
>>>>> On Sun, Nov 20, 2016 at 1:00 PM, wrote:
>>>>>> Build: http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_4.0/3015/,
>>>>>> Build Number: 3015,
>>>>>> Build Status: FAILURE

--
Eyal Edri
Associate Manager
RHV DevOps
EMEA ENG Virtualization R&D
Red Hat Israel

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
Re: [ovirt-devel] system tests failing on template export
It happened again in [1]:

2016-11-20 10:48:12,106 ERROR (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2205, in getAllTasksInfo
    allTasksInfo = sp.getAllTasksInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
2016-11-20 10:48:12,109 INFO (jsonrpc/2) [storage.TaskManager.Task] (Task='6c1ec6e7-fb37-465b-8e30-1613317683b2') aborting: Task is aborted: u'Secured object is not in safe state' - code 100 (task:1175)
2016-11-20 10:48:12,110 ERROR (jsonrpc/2) [storage.Dispatcher] Secured object is not in safe state (dispatcher:80)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
    raise self.error
SecureError: Secured object is not in safe state

[1] http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/3506/artifact/exported-artifacts/basic_suite_master.sh-el7/exported-artifacts/test_logs/basic-suite-master/post-006_network_by_label.py/lago-basic-suite-master-host1/_var_log_vdsm/vdsm.log

The storage VM is running on the same VM as the engine (to save memory) and it's serving both NFS and iSCSI. Do you think running it on the same VM as the engine might cause such issues?
On Mon, Oct 17, 2016 at 11:45 PM, Adam Litke wrote:
> On 17/10/16 11:51 +0200, Piotr Kliczewski wrote:
>> Adam,
>>
>> I see constant failures due to this and found:
>>
>> 2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task] Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error (task:870)
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>>     return fn(*args, **kargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper
>>     res = f(*args, **kwargs)
>>   File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo
>>     allTasksInfo = sp.getAllTasksInfo()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper
>>     raise SecureError("Secured object is not in safe state")
>> SecureError: Secured object is not in safe state
>
> This usually indicates that the SPM role has been lost, which happens most likely due to connection issues with the storage. What is the storage environment being used for the system tests?
>
>> Please take a look, not sure whether it is related. You can find the latest build here [1].
>>
>> Thanks,
>> Piotr
>>
>> [1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/668/
>>
>> On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin wrote:
>>> Hello,
>>>
>>> We've got several cases today where system tests failed when attempting to export templates:
>>>
>>> http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/(root)/004_basic_sanity/template_export/
>>>
>>> Related engine.log looks something like this:
>>> https://paste.fedoraproject.org/449936/47643643/raw/
>>>
>>> I could not find any obvious issues in SPM logs, could someone please take a look to confirm what may be causing this issue?
>>> >>> Full logs from the test are available here: >>> http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/ >>> >>> Regards, >>> Evgheni Dereveanchin >>> ___ >>> Devel mailing list >>> Devel@ovirt.org >>> http://lists.ovirt.org/mailman/listinfo/devel >>> >> > -- > Adam Litke > > ___ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel > > > -- Eyal Edri Associate Manager RHV DevOps EMEA ENG Virtualization R&D Red Hat Israel phone: +972-9-7692018 irc: eedri (on #tlv #rhev-dev #rhev-integ) ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Merge gating in Gerrit
On Sun, Nov 20, 2016 at 5:39 PM, Yedidyah Bar David wrote:
> On Sun, Nov 20, 2016 at 5:06 PM, Barak Korren wrote:
>> Hi all,
>>
>> Perhaps the main purpose of CI is to prevent breaking code from getting merged into the stable/master branches. Unfortunately our CI is not there yet, and one of the reasons for that is that we do a large amount of our CI tests only _after_ the code is merged.
>>
>> The reason for that is that when balancing thorough but time-consuming tests (e.g. an engine build with all permutations) vs. faster but more basic ones (e.g. "findbugs" and a single-permutation build), we typically choose the faster tests to be run per patch set and leave the thorough testing to be run only post-merge.
>>
>> We'd like to change that and have the thorough tests also run before merge. Ideally we would like to just hook stuff to the "submit" button, but Gerrit doesn't allow one to do that easily. So instead we'll need to adopt some kind of flag to indicate we want to submit, and have Jenkins "click" the submit button on our behalf if tests pass.
>>
>> I see two options here:
>> 1. Use Code-Review+2 as the indicator to run "heavy" CI and merge.

This is problematic. For example, in vdsm we have 5 maintainers with +2 and 4 maintainers with commit rights, but only 2 are commenting regularly.

>> 2. Add an "approve" flag that maintainers can set to +1 (this is what OpenStack is doing).

This seems better. But there is another requirement - a maintainer should be able to commit even if Jenkins fails. Sometimes the CI is broken, or there are flaky tests breaking the build, and some jobs fail regularly (check-merged) and I don't want to wait for them. Today we can override the CI vote and commit; if we keep it as is, I don't see any problem with this change.

Nir

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
On Sun, Nov 20, 2016 at 5:12 PM, Barak Korren wrote:
>> With the current setting (in vdsm), submitting a series of patches is a huge pain. Sometimes refreshing the page and submitting the next patch in the series works, but sometimes you have to rebase the next patches in the series again, and in the worst cases you have to do several rebases in the same series. This happens even when the entire series was already rebased properly before the submit.
>
> Actually vdsm is configured to "Cherry Pick" ATM. I'm not sure what the reasons for this were, but it should probably be changed to ff-only ASAP b/c as it is, it allows patches to be submitted completely out of order.
>
>> In vdsm we were bitten by this many times, and both Dan and I agree now that fast-forward is the only way.
>>
>> I don't think we need to agree on this for all projects; the whole point of having multiple projects is that we don't have to agree on every little detail - the project maintainers can do whatever they want.
>
> Ok, so can we get an agreement between the vdsm maintainers to change to "ff-only"?

+1

Dan, can you confirm?

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Merge gating in Gerrit
On Sun, Nov 20, 2016 at 5:06 PM, Barak Korren wrote:
> Hi all,
>
> Perhaps the main purpose of CI is to prevent breaking code from getting merged into the stable/master branches. Unfortunately our CI is not there yet, and one of the reasons for that is that we do a large amount of our CI tests only _after_ the code is merged.
>
> The reason for that is that when balancing thorough but time-consuming tests (e.g. an engine build with all permutations) vs. faster but more basic ones (e.g. "findbugs" and a single-permutation build), we typically choose the faster tests to be run per patch set and leave the thorough testing to be run only post-merge.
>
> We'd like to change that and have the thorough tests also run before merge. Ideally we would like to just hook stuff to the "submit" button, but Gerrit doesn't allow one to do that easily. So instead we'll need to adopt some kind of flag to indicate we want to submit, and have Jenkins "click" the submit button on our behalf if tests pass.
>
> I see two options here:
> 1. Use Code-Review+2 as the indicator to run "heavy" CI and merge.
> 2. Add an "approve" flag that maintainers can set to +1 (this is what OpenStack is doing).
>
> What would you prefer?

(2.), and call it "Run heavy CI tests", and only do this and not merge, so that one can ask to run these tests prior to merging.

--
Didi

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
> With the current setting (in vdsm), submitting a series of patches is a huge pain. Sometimes refreshing the page and submitting the next patch in the series works, but sometimes you have to rebase the next patches in the series again, and in the worst cases you have to do several rebases in the same series. This happens even when the entire series was already rebased properly before the submit.

Actually vdsm is configured to "Cherry Pick" ATM. I'm not sure what the reasons for this were, but it should probably be changed to ff-only ASAP b/c as it is, it allows patches to be submitted completely out of order.

> In vdsm we were bitten by this many times, and both Dan and I agree now that fast-forward is the only way.
>
> I don't think we need to agree on this for all projects; the whole point of having multiple projects is that we don't have to agree on every little detail - the project maintainers can do whatever they want.

Ok, so can we get an agreement between the vdsm maintainers to change to "ff-only"?

--
Barak Korren
bkor...@redhat.com
RHEV-CI Team

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
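For whoever ends up flipping the switch: the submit type is a per-project setting stored in the project.config file on the project's refs/meta/config branch. A sketch of the relevant stanza (from memory - double-check the exact option names against the docs for our Gerrit version):

```ini
# project.config on refs/meta/config -- per-project submit behavior.
# Other possible values: "rebase if necessary", "cherry pick",
# "merge if necessary", "merge always".
[submit]
    action = fast forward only
```

The file is edited by fetching refs/meta/config, committing the change, and pushing it back to refs/meta/config, or through the project-settings page in the web UI.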
[ovirt-devel] Merge gating in Gerrit
Hi all,

Perhaps the main purpose of CI is to prevent breaking code from getting merged into the stable/master branches. Unfortunately our CI is not there yet, and one of the reasons for that is that we do a large amount of our CI tests only _after_ the code is merged.

The reason for that is that when balancing thorough but time-consuming tests (e.g. an engine build with all permutations) vs. faster but more basic ones (e.g. "findbugs" and a single-permutation build), we typically choose the faster tests to be run per patch set and leave the thorough testing to be run only post-merge.

We'd like to change that and have the thorough tests also run before merge. Ideally we would like to just hook stuff to the "submit" button, but Gerrit doesn't allow one to do that easily. So instead we'll need to adopt some kind of flag to indicate we want to submit, and have Jenkins "click" the submit button on our behalf if tests pass.

I see two options here:
1. Use Code-Review+2 as the indicator to run "heavy" CI and merge.
2. Add an "approve" flag that maintainers can set to +1 (this is what OpenStack is doing).

What would you prefer?

--
Barak Korren
bkor...@redhat.com
RHEV-CI Team

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
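Option 2 would map onto Gerrit's custom-label mechanism. A hedged sketch of what such a flag could look like in a project's project.config - the label name and value descriptions here are invented (OpenStack calls its equivalent label "Workflow"), and the exact syntax depends on the Gerrit version:

```ini
# Hypothetical gating label in project.config (refs/meta/config).
# A maintainer sets Workflow+1; Jenkins watches for it, runs the heavy
# jobs, and submits the change on success.
[label "Workflow"]
    function = MaxWithBlock
    value = -1 Work in progress
    value = 0 Ready for reviews
    value = +1 Approved - run heavy CI and submit
```

Jenkins would then react to the label vote (e.g. via the Gerrit event stream) and, when the heavy jobs pass, press the submit button on the maintainer's behalf.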
Re: [ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
On Sun, Nov 20, 2016 at 4:07 PM, Barak Korren wrote:
> Hi there,
>
> I would like to address a concern that has been raised to us by multiple developers, and reach an agreement on how (and if) to remedy it.
>
> Let's assume the following situation:
> We have a Git repo in Gerrit with top commit C0 in master.
> At time t0 developers Alice and Bob push patches P1 and P2 respectively to master, so that we end up with the following situation in git:
> C0 <= P1 (this is Alice's patch)
> C0 <= P2 (this is Bob's patch)
>
> At time t1 CI runs for both patches, checking the code as it looks for each patch. Let's assume CI is successful for both.
>
> At time t2 Alice submits her patch and Gerrit merges it, resulting in the following situation in master:
> C0 <= P1
>
> At time t3 Bob submits his patch. Gerrit, seeing that master has changed, rebases the patch and merges it; the resulting situation (if the rebase is successful) is:
> C0 <= P1 <= P2
>
> This means that the resulting code was never tested in CI.

This makes the CI useless. To know if a patch actually passed the tests, you have to manually rebase each patch and wait for the CI - this takes up to 20 minutes on vdsm CI.

> This, in turn, causes various failures to show up post-merge despite having pre-merge CI run successfully.
>
> This situation is a result of the way our repos are currently configured. Most repos ATM are configured with the "Rebase If Necessary" submit type. This means that Gerrit tries to automatically rebase patches, as mentioned at t3 above.
>
> We could, instead, configure the repos to use the "Fast Forward Only" submit type. In that case, when Bob submits at t3, Gerrit refuses to merge and asks Bob to rebase (while offering a convenient button to do it). When he does, a new patch set gets pushed, and subsequently checked by CI.
>
> I recommend we switch all projects to use the "Fast Forward Only" submit type.
>
> Thoughts? Concerns?
We have fast-forward in ioprocess and ovirt-imageio, and we are happy with this setting.

Another advantage of fast-forward-only merges is being able to submit multiple patches with *one click*. If you submit the top patch in a series, all the patches below it are submitted.

With the current setting (in vdsm), submitting a series of patches is a huge pain. Sometimes refreshing the page and submitting the next patch in the series works, but sometimes you have to rebase the next patches in the series again, and in the worst cases you have to do several rebases in the same series. This happens even when the entire series was already rebased properly before the submit.

In vdsm we were bitten by this many times, and both Dan and I agree now that fast-forward is the only way.

I don't think we need to agree on this for all projects; the whole point of having multiple projects is that we don't have to agree on every little detail - the project maintainers can do whatever they want.

Thanks for raising this issue.

Nir

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
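The one-click series submit follows directly from how fast-forward merges work in git: the branch pointer simply moves to the tip of the series. A minimal sketch with plain git (repository, branch, and file names are invented for the illustration):

```shell
# With fast-forward-only, landing the tip of a series lands every patch
# below it in one step: master just moves forward to the series tip.
set -e
rm -rf series-repo
git init -q series-repo
git -C series-repo config user.email demo@example.com
git -C series-repo config user.name demo
MAIN=$(git -C series-repo symbolic-ref --short HEAD)  # master or main

# C0: the shared base commit
echo base > series-repo/file
git -C series-repo add file
git -C series-repo commit -qm C0

# A 3-patch series S1 <= S2 <= S3 on top of C0
git -C series-repo checkout -qb series
for p in S1 S2 S3; do
    echo "$p" > "series-repo/$p"
    git -C series-repo add "$p"
    git -C series-repo commit -qm "$p"
done

# Submitting the tip: one fast-forward brings in S1, S2 and S3 together
git -C series-repo checkout -q "$MAIN"
git -C series-repo merge -q --ff-only series
git -C series-repo log --oneline
```

Because the whole series moves as one unit, there is no window in which a half-submitted series needs re-rebasing.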
[ovirt-devel] Gerrit parallel patch handling and CI (Or, why did my code fail post-merge)
Hi there,

I would like to address a concern that has been raised to us by multiple developers, and reach an agreement on how (and if) to remedy it.

Let's assume the following situation:
We have a Git repo in Gerrit with top commit C0 in master.
At time t0 developers Alice and Bob push patches P1 and P2 respectively to master, so that we end up with the following situation in git:
C0 <= P1 (this is Alice's patch)
C0 <= P2 (this is Bob's patch)

At time t1 CI runs for both patches, checking the code as it looks for each patch. Let's assume CI is successful for both.

At time t2 Alice submits her patch and Gerrit merges it, resulting in the following situation in master:
C0 <= P1

At time t3 Bob submits his patch. Gerrit, seeing that master has changed, rebases the patch and merges it; the resulting situation (if the rebase is successful) is:
C0 <= P1 <= P2

This means that the resulting code was never tested in CI. This, in turn, causes various failures to show up post-merge despite having pre-merge CI run successfully.

This situation is a result of the way our repos are currently configured. Most repos ATM are configured with the "Rebase If Necessary" submit type. This means that Gerrit tries to automatically rebase patches, as mentioned at t3 above.

We could, instead, configure the repos to use the "Fast Forward Only" submit type. In that case, when Bob submits at t3, Gerrit refuses to merge and asks Bob to rebase (while offering a convenient button to do it). When he does, a new patch set gets pushed, and subsequently checked by CI.

I recommend we switch all projects to use the "Fast Forward Only" submit type.

Thoughts? Concerns?

--
Barak Korren
bkor...@redhat.com
RHEV-CI Team

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
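The timeline above is easy to reproduce with plain git; a minimal sketch (repository and file names are invented for the illustration):

```shell
# Reproduce the scenario: P1 and P2 are both based on C0, each is tested
# alone, then they are combined at submit time without being retested.
set -e
rm -rf demo-repo
git init -q demo-repo
git -C demo-repo config user.email demo@example.com
git -C demo-repo config user.name demo
MAIN=$(git -C demo-repo symbolic-ref --short HEAD)  # master or main

echo base > demo-repo/file                          # C0: shared base commit
git -C demo-repo add file
git -C demo-repo commit -qm C0

git -C demo-repo checkout -qb p1                    # Alice's patch: C0 <= P1
echo alice > demo-repo/alice
git -C demo-repo add alice
git -C demo-repo commit -qm P1

git -C demo-repo checkout -qb p2 "$MAIN"            # Bob's patch: C0 <= P2
echo bob > demo-repo/bob
git -C demo-repo add bob
git -C demo-repo commit -qm P2

git -C demo-repo checkout -q "$MAIN"                # Alice submits: P1 lands
git -C demo-repo merge -q --ff-only p1

# "Rebase If Necessary": Gerrit silently rebases P2 onto P1 and merges it,
# so the combined tree C0 <= P1 <= P2 lands without ever running through CI.
git -C demo-repo rebase -q "$MAIN" p2
git -C demo-repo checkout -q "$MAIN"
git -C demo-repo merge -q --ff-only p2
git -C demo-repo log --oneline
```

Under "Fast Forward Only", Gerrit would refuse Bob's submit until the rebase is pushed as a new patch set - which is exactly the point at which CI gets another chance to test the combined result.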