[ovirt-devel] Re: [ci test tool] how to start build-artifacts job using jenkins API?

2019-03-07 Thread Nir Soffer
On Wed, Mar 6, 2019 at 12:56 AM Nir Soffer  wrote:

> I'm trying to write a tool that, given a gerrit patch number, will:
> 1. Start the build-artifacts job (instead of adding a "ci please build" comment)
> 2. Wait until the job is complete
> 3. Start OST with CUSTOM_REPOS using the build artifacts job from step 1
> 4. Wait until OST completes
> 5. Post a comment with the OST link to gerrit
>
> So testing patches will be as easy as:
>
>$ ci test 12345
>Building rpms...
>Starting OST...
>OST succeeded, congratulations!
>
> I can start OST and watch it, but I could not find a way to start the vdsm
> build-artifacts job.
>
> This starts a job:
>
> curl -i \
> --user USERNAME:API_TOKEN \
> -X POST \
> https://jenkins.ovirt.org/job/vdsm_standard-check-patch/build
>
> But it fails quickly in the first "detecting" stage.
>
> https://jenkins.ovirt.org/blue/organizations/jenkins/vdsm_standard-check-patch/detail/vdsm_standard-check-patch/3575/pipeline
>
> Clicking "Build Now" button on the job page fails in the same way.
>
> And this:
>
> curl -i \
> --user USERNAME:API_TOKEN \
> -X POST \
>
> https://jenkins.ovirt.org/job/vdsm_standard-check-patch/buildWithParameters
>
> Fails with a 500 Internal Server Error and a huge traceback, hiding this
> expected message:
>
> java.lang.IllegalStateException: This build is not parameterized!
>
> Looks like the jobs need to be parameterized:
>
> - https://jenkins.ovirt.org/job/vdsm_standard-check-patch/configure
> - https://jenkins.ovirt.org/job/ovirt-system-tests_manual/configure
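> A minimal sketch of what the trigger could look like once the job accepts
> parameters (the GERRIT_REFSPEC parameter name below is only an assumption
> for illustration, not the job's actual configuration):
>
>     import requests
>
>     resp = requests.post(
>         "https://jenkins.ovirt.org/job/vdsm_standard-check-patch/buildWithParameters",
>         auth=("USERNAME", "API_TOKEN"),
>         # hypothetical parameter; the real job would define its own
>         data={"GERRIT_REFSPEC": "refs/changes/45/12345/1"},
>     )
>     resp.raise_for_status()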
>
> I guess we can work around this by posting a "ci please build" comment on
> gerrit using the gerrit API, and then waiting and grabbing the build link
> from the comment added by jenkins, but I really don't want to go in this
> direction.
>
> I hope we can have a proper solution in the job configuration.
>

With Barak's help I got the entire flow working
(using the secret standard-manual-runner).

I started this project on github:
https://github.com/nirs/oci

There is no code yet, but the entire flow is documented here:
https://github.com/nirs/oci/blob/master/test.md

Sharing in case someone wants to write this before I find time for this :-)
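
A rough sketch of the waiting part of the flow in Python (the queue item URL
comes from the Location header Jenkins returns when a build is triggered;
this is only an illustration under those assumptions, not the oci code):

    import time
    import requests

    AUTH = ("USERNAME", "API_TOKEN")

    def wait_for_build(queue_url):
        # The queue item grows an "executable" entry once the build starts.
        while True:
            item = requests.get(queue_url + "api/json", auth=AUTH).json()
            if "executable" in item:
                build_url = item["executable"]["url"]
                break
            time.sleep(10)
        # Then poll the build itself until Jenkins reports a result.
        while True:
            build = requests.get(build_url + "api/json", auth=AUTH).json()
            if build.get("result"):
                return build_url, build["result"]
            time.sleep(30)

The same pattern should work for step 4, waiting for the OST run started with
CUSTOM_REPOS pointing at the build-artifacts job from step 1.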

Nir
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/LLOCXCBN7W464R4JGI43UNGS5T2LAIFZ/


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ 008_basic_ui_sanity.initialize_firefox]

2019-03-07 Thread Greg Sheremeta
On Thu, Mar 7, 2019 at 4:58 AM Nir Soffer  wrote:

> Please run OST again to ensure we have a real issue and not a random
> failure.
>

+1, the errors indicate a random network blip while running the test. It's
definitely not related to any patch.

Let me know if rerunning doesn't work.

Greg


>
> On Thu, Mar 7, 2019, 10:08 Marcin Sobczyk 
>> Hi,
>>
>> none of my UI-sanity-tests-optimization patches were merged, so this is not
>> related to my recent work.
>>
>> Regards, Marcin
>> On 3/7/19 8:48 AM, Galit Rosenthal wrote:
>>
>> Hi,
>>
>> we are failing basic suite master (vdsm)
>>
>> Can you please have a look at the issue?
>>
>> *Error Message:*
>>
>> Message: Error forwarding the new session new session request for webdriver 
>> should contain a location header or an 'application/json;charset=UTF-8' 
>> response body with the session ID.
>>
>>  *from the [2]:*
>>
>>  *08:22:04* + xmllint --format /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
>>  *08:22:04* <testsuite name="nosetests" tests="7" errors="1" failures="0" skip="0">
>>  *08:22:04*   <testcase ... time="0.001"/>
>>  *08:22:04*   <testcase name="start_grid" time="144.413">
>> 

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master (ovirt-engine-nodejs-modules) ] [ 27-02-2019 ] [ 002_bootstrap.add_vm2_lease ]

2019-03-07 Thread Nir Soffer
On Thu, Mar 7, 2019 at 8:50 PM Dafna Ron  wrote:

> I report this issue to the list and the test owner is more than welcome to
> respond.
> Galit, can you please review the skiptest
> https://gerrit.ovirt.org/#/c/98191/
>

I looked at the engine and vdsm logs, and I don't think the add_vm2_lease
test is related, so there is no point in disabling it.

It seems that just after finishing the resize_and_refresh_storage_domain test:

2019-02-27 13:37:21,036-0500 INFO (jsonrpc/5) [vdsm.api] START
resizePV(sdUUID=u'34311cc1-c4d2-4cfe-88b5-dd5ad72261d3',
spUUID=u'717575a9-7818-460d-ba3a-d5bdd8ef9ed3',
guid=u'360014054e5952dc174c4a12b971ea45c', options=None)
from=:::192.168.201.4,43920,
flow_id=f204cfe2-ef48-429d-ab33-e1175d0530a0,
task_id=5908a867-5e11-474f-8119-49cfe219d4c8 (api:48)
2019-02-27 13:37:21,499-0500 INFO (jsonrpc/5) [vdsm.api] FINISH resizePV
return={'size': '24293408768'} from=:::192.168.201.4,43920,
flow_id=f204cfe2-ef48-429d-ab33-e1175d0530a0,
task_id=5908a867-5e11-474f-8119-49cfe219d4c8 (api:54)

Engine tries to look up the VM lease. The lookup fails (expected, since there
is no such lease):

2019-02-27 13:37:21,968-0500 INFO  (jsonrpc/1) [vdsm.api] START
lease_info(lease={u'sd_id': u'b370b91d-00b5-4f62-9270-ac0acd47d075',
u'lease_id': u'3500eb82-e5e2-4e24-b41c-ea02d9f6adee'})
from=:::192.168.201.4,43920,
flow_id=117dec74-ad59-4b12-8148-b2c130337c10,
task_id=9c297d41-0aa7-4c74-b268-b710e666bc6c (api:48)

2019-02-27 13:37:21,987-0500 INFO  (jsonrpc/1) [vdsm.api] FINISH lease_info
error=No such lease 3500eb82-e5e2-4e24-b41c-ea02d9f6adee
from=:::192.168.201.4,43920,
flow_id=117dec74-ad59-4b12-8148-b2c130337c10,
task_id=9c297d41-0aa7-4c74-b268-b710e666bc6c (api:52)

On the engine side, we see immediately after that:

2019-02-27 13:37:22,078-05 DEBUG
[org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor]
(default task-1) [117dec74-ad59-4b12-8148-b2c130337c10] method:
getListForSpmSelection, params: [717575a9-7818-460d-ba3a-d5bdd8ef9ed3],
timeElapsed: 12ms

This means nothing to me, but it seems that engine treats the expected
lease lookup error as a fatal error in the SPM.

Soon the engine tries to stop the SPM on host-0, which makes no sense:

2019-02-27 13:37:22,125-05 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default
task-1) [117dec74-ad59-4b12-8148-b2c130337c10] SpmStopVDSCommand::Stopping
SPM on vds 'lago-basic-suite-master-host-0', pool id
'717575a9-7818-460d-ba3a-d5bdd8ef9ed3'

At the same time, host 1 has trouble seeing the LUNs, and fails to connect to
the pool:

2019-02-27 13:37:20,014-0500 INFO  (jsonrpc/4) [vdsm.api] START
getDeviceList(storageType=3, guids=[u'360014055ad799e3a968444abdaefa323',
u'360014054e5952dc174c4a12b971ea45c'], checkStatus=False, options={})
from=:::192.168.201.4,33922,
flow_id=f204cfe2-ef48-429d-ab33-e1175d0530a0,
task_id=78fb7d20-8bec-4561-9867-da4174d181a6 (api:48)

2019-02-27 13:37:20,340-0500 INFO  (jsonrpc/4) [vdsm.api] FINISH
getDeviceList return={'devList': []} from=:::192.168.201.4,33922,
flow_id=f204cfe2-ef48-429d-ab33-e1175d
0530a0, task_id=78fb7d20-8bec-4561-9867-da4174d181a6 (api:54)

The host does not see any LUNs.

So of course it cannot see the master storage domain.

2019-02-27 13:37:22,718-0500 INFO  (jsonrpc/6) [vdsm.api] FINISH
connectStoragePool error=Cannot find master domain:
u'spUUID=717575a9-7818-460d-ba3a-d5bdd8ef9ed3, msdUUID=3

So we probably go into the SPM recovery flow, which takes some time, and
adding the lease is just a victim of that, since adding leases requires the
SPM.

We need to answer these questions:

- Why does host 1 not see storage? We have seen this many times in the past.
- Why does engine try to stop the SPM on host 0, which has no issue?
  (maybe related to incorrect handling of the "no such lease" error?)
- Why did the previous tests not detect that host 1 was not connected to
  storage?
- Why did the previous test not wait until both hosts are connected to
  storage?

We can try to increase the timeout of the add lease test, so we can recover
if the SPM is lost for some reason during this test, but this would only hide
the real storage issues.
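
For illustration, the kind of change meant here, assuming the suite's
ovirtlago.testlib.assert_equals_within helper takes a condition, an expected
value and a timeout in seconds (as the tracebacks elsewhere on this list
suggest); the condition is a placeholder, not the actual test code:

    from ovirtlago import testlib

    def lease_is_ready():
        # Placeholder for the real check (the actual test asks engine for
        # the VM lease); replace with the suite's condition.
        return True

    # Poll for up to 10 minutes (arbitrary example) so a temporary SPM
    # re-election does not fail the test immediately.
    testlib.assert_equals_within(lease_is_ready, True, 10 * 60)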

Since this is not reproducible, and we don't understand the storage
failures, we should run OST
again after failures, and keep failing logs so we can investigate them
later.

Strangely, we don't experience this issue on master - I did 20-30 OST runs
in the last 2 weeks, and the only issue I have seen is a random host_devices
failure. Why does the change queue have more failures? What is the difference
between OST master runs and the change queue?

Nir


>
> Thanks,
> Dafna
>
>
> On Tue, Mar 5, 2019 at 2:43 PM Nir Soffer  wrote:
>
>>
>>
>> On Tue, Mar 5, 2019, 13:27 Eyal Shenitzky >
>>>
>>>
>>> On Tue, Mar 5, 2019 at 12:58 PM Dafna Ron  wrote:
>>>
 Tal,

 I see the bug is in post but the patch was not merged yet:
 https://gerrit.ovirt.org/#/c/98191/

 can you tell me when will we merge the patch? as I rather not add
 SkipTest if this will be merged soon,

[ovirt-devel] Re: OST's basic suite UI sanity tests optimization

2019-03-07 Thread Greg Sheremeta
Marcin,

It just dawned on me that the main reason 008's start_grid takes so long is
that the docker images are freshly pulled every time. Several hundred MB,
every time (ugh, sorry). We can and should cache them. What do you think
about trying this before doing anything else? [it would also be a good time
to update from actinium to the latest, iron.]

@Barak Korren  you once mentioned to me we should cache
these if they are ok to cache (they are). How do we do that?

docker.io/selenium/node-chrome-debug    3.9.1-actinium   327adc897d23   13 months ago   904 MB
docker.io/selenium/node-firefox-debug   3.9.1-actinium   88649b420bd5   13 months ago   814 MB

Greg


On Tue, Mar 5, 2019 at 6:15 AM Greg Sheremeta  wrote:

>
> On Tue, Mar 5, 2019 at 4:55 AM Marcin Sobczyk  wrote:
>
>> Hi,
>> On 3/4/19 7:07 PM, Greg Sheremeta wrote:
>>
>> Hi,
>>
>> Thanks for trying to improve the tests!
>>
>> I'm reluctant to give up Firefox sanity tests on every commit, though. In
>> fact, I wanted to add Edge and Safari, because those are also supported
>> browsers. Just today a Firefox only issue was reported, so they are
>> valuable.
>>
>> Was the Firefox-only issue detected by basic suite or some other tests?
>>
> It was reported by a developer. Because GWT compiles permutations per
> browser, and each browser therefore loads completely separate JavaScript
> payloads, it's just too easy for it to break in one browser and be fine in
> the other, so I'm really not ok to remove Firefox.
>
> If Admin Portal was React where there is a single JavaScript payload
> that's shared among all browsers, then I'd consider it.
>
>>
>> Did you consider either leaving a grid up permanently or perhaps using a
>> third party like saucelabs?
>>
>> I did consider simply having our own grid for the OST.
>> There's even a thread somewhere on ovirt-devel, where someone found OST
>> trying to connect to one of my VMs in Tel Aviv, where my own grid was
>> running :D
>> I couldn't make a public demo though - OST executors couldn't see my VM
>> in tlv.
>>
>> This approach has 2 big flaws:
>>
>>- it requires quite a lot of resources for the grid to always be
>>there for us
>>
>> What about Saucelabs or another third party free tool?
>
>
>>
>>- it makes OST running times somehow undeterministic - situations,
>>where WebDriver has to wait for Selenium hub/nodes to be free, will
>>probably take place
>>
>> The way I see basic suite's UI sanity tests, is that they're exactly what
>> they're called - sanity tests.
>> We do trivial checks like "can we log in to the webadmin site", "can we
>> go to 'virtual machines' sub-page".
>> I'm not in favor of dropping these completely - I think they make sense,
>> but I also think we can live with a trimmed-down version that saves a lot
>> of time.
>> As I said - AFAIK QE have their own Selenium grid, where they run more
>> complex tests on the UI.
>>
>
> Yes, OST basic_ui_sanity tests aren't "compatibility" tests. We're not
> checking pixels or look. They are super simple "does the app load" tests,
> are very valuable, and we're not dropping them.
>
> Greg
>
> Regards, Marcin
>>
>>
>>
>> Best wishes,
>> Greg
>>
>> On Mon, Mar 4, 2019, 11:39 AM Marcin Sobczyk  wrote:
>>
>>> Hi,
>>>
>>> *TL; DR* Let's cut the running time of '008_basic_ui_sanity.py' by more
>>> than 3 minutes by sacrificing Firefox and Chrome screenshots in favor of
>>> Chromium.
>>> During the OST hackathon in Brno this year, I saw an opportunity to
>>> optimize basic UI sanity tests from basic suite.
>>> The way we currently run them is by setting up a Selenium grid using 3
>>> docker containers, with a dedicated network... that's insanity! (pun
>>> intended).
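>>> A minimal sketch of the kind of local, headless driver this would replace
>>> the grid with (assuming a chromedriver binary on the executor and the
>>> selenium Python bindings; an illustration, not the actual OST code):
>>>
>>>     from selenium import webdriver
>>>
>>>     options = webdriver.ChromeOptions()
>>>     options.add_argument("--headless")    # no display on CI executors
>>>     options.add_argument("--no-sandbox")
>>>     driver = webdriver.Chrome(options=options)  # local driver, no Selenium hub
>>>     try:
>>>         driver.get("https://engine-fqdn/ovirt-engine/webadmin")
>>>         driver.save_screenshot("webadmin.png")
>>>     finally:
>>>         driver.quit()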
>>> Let's take a look at the running time of the '008_basic_ui_sanity.py' scenario (
>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4197/):
>>>
>>>
>>> 01:31:50 @ Run test: 008_basic_ui_sanity.py:
>>> 01:31:50 nose.config: INFO: Ignoring files matching ['^\\.', '^_',
>>> '^setup\\.py$']
>>> 01:31:50   # init:
>>> 01:31:50   # init: Success (in 0:00:00)
>>> 01:31:50   # start_grid:
>>> 01:34:05   # start_grid: Success (in 0:02:15)
>>> 01:34:05   # initialize_chrome:
>>> 01:34:18   # initialize_chrome: Success (in 0:00:13)
>>> 01:34:18   # login:
>>> 01:34:27   # login: Success (in 0:00:08)
>>> 01:34:27   # left_nav:
>>> 01:34:45   # left_nav: Success (in 0:00:18)
>>> 01:34:45   # close_driver:
>>> 01:34:46   # close_driver: Success (in 0:00:00)
>>> 01:34:46   # initialize_firefox:
>>> 01:35:02   # initialize_firefox: Success (in 0:00:16)
>>> 01:35:02   # login:
>>> 01:35:11   # login: Success (in 0:00:08)
>>> 01:35:11   # left_nav:
>>> 01:35:29   # left_nav: Success (in 0:00:18)
>>> 01:35:29   # cleanup:
>>> 01:35:36   # cleanup: Success (in 0:00:06)
>>> 01:35:36   # Results located at
>>> /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
>>> 01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master (ovirt-engine-nodejs-modules) ] [ 27-02-2019 ] [ 002_bootstrap.add_vm2_lease ]

2019-03-07 Thread Dafna Ron
I report this issue to the list and the test owner is more than welcome to
respond.
Galit, can you please review the skiptest
https://gerrit.ovirt.org/#/c/98191/

Thanks,
Dafna


On Tue, Mar 5, 2019 at 2:43 PM Nir Soffer  wrote:

>
>
> On Tue, Mar 5, 2019, 13:27 Eyal Shenitzky 
>>
>>
>> On Tue, Mar 5, 2019 at 12:58 PM Dafna Ron  wrote:
>>
>>> Tal,
>>>
>>> I see the bug is in post but the patch was not merged yet:
>>> https://gerrit.ovirt.org/#/c/98191/
>>>
>>> can you tell me when will we merge the patch? as I rather not add
>>> SkipTest if this will be merged soon,
>>>
>>> thanks,
>>> Dafna
>>>
>>>
>>> On Mon, Mar 4, 2019 at 10:42 AM Dafna Ron  wrote:
>>>
 As I had another failure of this today I will be disabling this test
 until issue is resolved (https://bugzilla.redhat.com/1684267)

 Thanks,
 Dafna


 On Thu, Feb 28, 2019 at 8:48 PM Nir Soffer  wrote:

> On Thu, Feb 28, 2019 at 11:52 AM Dafna Ron  wrote:
>
>> Hi,
>>
>> We have a failure on the project in basic suite, master branch. The
>> recent failure was in patch:
>> https://gerrit.ovirt.org/#/c/98087/1 - Add pre-seed for ovirt-web-ui
>>
>> CQ is pointing at the below as the root cause (which was merged a
>> while back):
>> https://gerrit.ovirt.org/#/c/97491/ - Add pre-seed for ovirt-web-ui
>>
>> Can you please check the issue as it seems both patches are changing
>> the same thing and the project seem to be broken since
>> https://gerrit.ovirt.org/#/c/97491/3
>>
>> Latest failure:
>> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/13171/
>>
>> Logs:
>>
>> https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/13171/artifact/basic-suite.el7.x86_64/test_logs/basic-suite-master/post-002_bootstrap.py/
>>
>> errors from logs:
>> Engine:
>>
>> 2019-02-27 13:37:28,479-05 ERROR
>> [org.ovirt.engine.core.bll.UpdateVmCommand] (default task-1) [74283e25]
>> Transaction rolled-back for command
>> 'org.ovirt.engine.core.bll.UpdateVmCommand'.
>> 2019-02-27 13:37:28,483-05 ERROR
>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>> (default task-1) [74283e25] EVENT_ID: USER_FAILED_UPDATE_VM(58), Failed 
>> to
>> update VM vm2 (User: admin@inter
>> nal-authz).
>> 2019-02-27 13:37:28,485-05 INFO
>> [org.ovirt.engine.core.bll.UpdateVmCommand] (default task-1) [74283e25]
>> Lock freed to object 'EngineLock:{exclusiveLocks='[vm2=VM_NAME]',
>> sharedLocks='[3500eb82-e5e2-4e24-b41c-ea
>> 02d9f6adee=VM]'}'
>> 2019-02-27 13:37:28,485-05 DEBUG
>> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor]
>> (default task-1) [74283e25] method: runAction, params: [UpdateVm,
>> VmManagementParametersBase:{commandId='340
>> 59769-05b9-429e-8356-f6b9b9953f55', user='admin',
>> commandType='UpdateVm', vmId='3500eb82-e5e2-4e24-b41c-ea02d9f6adee'}],
>> timeElapsed: 6618ms
>> 2019-02-27 13:37:28,486-05 ERROR
>> [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default
>> task-1) [] Operation Failed: []
>> 2019-02-27 13:37:28,579-05 DEBUG
>> [org.ovirt.engine.core.common.di.interceptor.DebugLoggingInterceptor]
>> (EE-ManagedThreadFactory-engineScheduled-Thread-85) [] method: get, 
>> params:
>> [e29c0ba1-464c-4eb4-a8f2-c6933d9
>> 9969a], timeElapsed: 3ms
>>
>>
>> vdsm:
>>
>> 2019-02-27 13:37:21,987-0500 INFO  (jsonrpc/1) [vdsm.api] FINISH
>> lease_info error=No such lease 3500eb82-e5e2-4e24-b41c-ea02d9f6adee
>> from=:::192.168.201.4,43920,
>> flow_id=117dec74-ad59-4b12-8148-b2c130337c10,
>>  task_id=9c297d41-0aa7-4c74-b268-b710e666bc6c (api:52)
>> 2019-02-27 13:37:21,988-0500 ERROR (jsonrpc/1)
>> [storage.TaskManager.Task] (Task='9c297d41-0aa7-4c74-b268-b710e666bc6c')
>> Unexpected error (task:875)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>> 882, in _run
>> return fn(*args, **kargs)
>>   File "", line 2, in lease_info
>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line
>> 50, in method
>> ret = func(*args, **kwargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line
>> 3702, in lease_info
>> info = dom.lease_info(lease.lease_id)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line
>> 674, in lease_info
>> return vol.lookup(lease_id)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py",
>> line 553, in lookup
>> raise NoSuchLease(lease_id)
>> NoSuchLease: No such lease 3500eb82-e5e2-4e24-b41c-ea02d9f6adee
>>
>
> This is not an error of vdsm. Someone asked for information about a
> non-existing lease, and vdsm fails the request with the expected error:
>>>

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Dafna Ron
passing build:
https://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/13311/
thanks!
Dafna


On Thu, Mar 7, 2019 at 3:07 PM Dafna Ron  wrote:

> monitoring.
> thanks
>
> On Thu, Mar 7, 2019 at 11:58 AM Fred Rolland  wrote:
>
>> Fix is merged
>>
>> On Thu, Mar 7, 2019 at 12:37 PM Dafna Ron  wrote:
>>
>>> first set of tests are failing on the same issue.
>>> if Benny says its a real issue then lets merge the fix
>>>
>>> Thanks,
>>> Dafna
>>>
>>>
>>> On Thu, Mar 7, 2019 at 10:18 AM Benny Zlotnik 
>>> wrote:
>>>
 It looks like a real issue, in patch[1] the name of the config value
 was changed, however, the name fn_db_add_config_value wasn't change, so it
 works when upgrading but not in a clean installation

 I have posted a fix[2]



 [1] -
 https://gerrit.ovirt.org/#/c/98228/3/packaging/dbscripts/upgrade/pre_upgrade/_config.sql
 [2] - https://gerrit.ovirt.org/#/c/98317/

 On Thu, Mar 7, 2019 at 12:13 PM Dafna Ron  wrote:

> We do not yet know if this is a real issue .
> I will update when I have more information.
>
> On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:
>
>> Did it fail once ir all build fail with error?
>>
>> If it failed only once, run again to ensure this is a real error.
>>
>> How many times this test fail in the past week, month, year?
>>
>> On Thu, Mar 7, 2019, 11:24 Galit Rosenthal > wrote:
>>
>>> Hi,
>>>
>>> we are failing basic suite master ( ovirt-engine)
>>>
>>> It looks a problem in ovirt-engine
>>>
>>> Can you please have a look at the issue?
>>>
>>> Thanks,
>>> Galit
>>>
>>> Error [1]:
>>> 'NoneType' object has no attribute 'status'
>>>
>>>  >> begin captured logging << 
>>> ovirtlago.testlib: ERROR: * Unhandled exception in >>  at 0x7f5e9feb3ed8>
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 
>>> 234, in assert_equals_within
>>> res = func()
>>>   File 
>>> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>>>  line 1153, in 
>>> lambda: api.disks.get(disk_name).status.state == 'ok',
>>> AttributeError: 'NoneType' object has no attribute 'status'
>>> - >> end captured logging << -
>>>
>>>
>>> US CQ of the ovirt-engine results can be found [2]
>>> We see the same error DS as well [3]
>>>
>>>
>>> [1]
>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>>>
>>> [2]
>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>>>
>>> [3]
>>> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
>>> --
>>>
>>> GALIT ROSENTHAL
>>>
>>> SOFTWARE ENGINEER
>>>
>>> Red Hat
>>>
>>> 
>>>
>>> ga...@gmail.comT: 972-9-7692230
>>> 
>>> ___
>>> Devel mailing list -- devel@ovirt.org
>>> To unsubscribe send an email to devel-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>>>
>> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/
>
 ___
>>> Devel mailing list -- devel@ovirt.org
>>> To unsubscribe send an email to devel-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/UO3LDUD36D6SZSIKI2GZBOS2Y2E4QRC3/
>>>
>>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Dafna Ron
monitoring.
thanks

On Thu, Mar 7, 2019 at 11:58 AM Fred Rolland  wrote:

> Fix is merged
>
> On Thu, Mar 7, 2019 at 12:37 PM Dafna Ron  wrote:
>
>> first set of tests are failing on the same issue.
>> if Benny says its a real issue then lets merge the fix
>>
>> Thanks,
>> Dafna
>>
>>
>> On Thu, Mar 7, 2019 at 10:18 AM Benny Zlotnik 
>> wrote:
>>
>>> It looks like a real issue, in patch[1] the name of the config value was
>>> changed, however, the name fn_db_add_config_value wasn't change, so it
>>> works when upgrading but not in a clean installation
>>>
>>> I have posted a fix[2]
>>>
>>>
>>>
>>> [1] -
>>> https://gerrit.ovirt.org/#/c/98228/3/packaging/dbscripts/upgrade/pre_upgrade/_config.sql
>>> [2] - https://gerrit.ovirt.org/#/c/98317/
>>>
>>> On Thu, Mar 7, 2019 at 12:13 PM Dafna Ron  wrote:
>>>
 We do not yet know if this is a real issue .
 I will update when I have more information.

 On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:

> Did it fail once ir all build fail with error?
>
> If it failed only once, run again to ensure this is a real error.
>
> How many times this test fail in the past week, month, year?
>
> On Thu, Mar 7, 2019, 11:24 Galit Rosenthal 
>> Hi,
>>
>> we are failing basic suite master ( ovirt-engine)
>>
>> It looks a problem in ovirt-engine
>>
>> Can you please have a look at the issue?
>>
>> Thanks,
>> Galit
>>
>> Error [1]:
>> 'NoneType' object has no attribute 'status'
>>
>>  >> begin captured logging << 
>> ovirtlago.testlib: ERROR: * Unhandled exception in >  at 0x7f5e9feb3ed8>
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 
>> 234, in assert_equals_within
>> res = func()
>>   File 
>> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>>  line 1153, in 
>> lambda: api.disks.get(disk_name).status.state == 'ok',
>> AttributeError: 'NoneType' object has no attribute 'status'
>> - >> end captured logging << -
>>
>>
>> US CQ of the ovirt-engine results can be found [2]
>> We see the same error DS as well [3]
>>
>>
>> [1]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>>
>> [2]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>>
>> [3]
>> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
>> --
>>
>> GALIT ROSENTHAL
>>
>> SOFTWARE ENGINEER
>>
>> Red Hat
>>
>> 
>>
>> ga...@gmail.comT: 972-9-7692230
>> 
>> ___
>> Devel mailing list -- devel@ovirt.org
>> To unsubscribe send an email to devel-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>>
> ___
 Devel mailing list -- devel@ovirt.org
 To unsubscribe send an email to devel-le...@ovirt.org
 Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 List Archives:
 https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/

>>> ___
>> Devel mailing list -- devel@ovirt.org
>> To unsubscribe send an email to devel-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/UO3LDUD36D6SZSIKI2GZBOS2Y2E4QRC3/
>>
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/HYXZKT5JDVCVD3J2EPAMN7DMJQICW3A5/


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Fred Rolland
Fix is merged

On Thu, Mar 7, 2019 at 12:37 PM Dafna Ron  wrote:

> first set of tests are failing on the same issue.
> if Benny says its a real issue then lets merge the fix
>
> Thanks,
> Dafna
>
>
> On Thu, Mar 7, 2019 at 10:18 AM Benny Zlotnik  wrote:
>
>> It looks like a real issue, in patch[1] the name of the config value was
>> changed, however, the name fn_db_add_config_value wasn't change, so it
>> works when upgrading but not in a clean installation
>>
>> I have posted a fix[2]
>>
>>
>>
>> [1] -
>> https://gerrit.ovirt.org/#/c/98228/3/packaging/dbscripts/upgrade/pre_upgrade/_config.sql
>> [2] - https://gerrit.ovirt.org/#/c/98317/
>>
>> On Thu, Mar 7, 2019 at 12:13 PM Dafna Ron  wrote:
>>
>>> We do not yet know if this is a real issue .
>>> I will update when I have more information.
>>>
>>> On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:
>>>
 Did it fail once ir all build fail with error?

 If it failed only once, run again to ensure this is a real error.

 How many times this test fail in the past week, month, year?

 On Thu, Mar 7, 2019, 11:24 Galit Rosenthal >>>
> Hi,
>
> we are failing basic suite master ( ovirt-engine)
>
> It looks a problem in ovirt-engine
>
> Can you please have a look at the issue?
>
> Thanks,
> Galit
>
> Error [1]:
> 'NoneType' object has no attribute 'status'
>
>  >> begin captured logging << 
> ovirtlago.testlib: ERROR: * Unhandled exception in  
> at 0x7f5e9feb3ed8>
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, 
> in assert_equals_within
> res = func()
>   File 
> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>  line 1153, in 
> lambda: api.disks.get(disk_name).status.state == 'ok',
> AttributeError: 'NoneType' object has no attribute 'status'
> - >> end captured logging << -
>
>
> US CQ of the ovirt-engine results can be found [2]
> We see the same error DS as well [3]
>
>
> [1]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>
> [2]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>
> [3]
> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
> --
>
> GALIT ROSENTHAL
>
> SOFTWARE ENGINEER
>
> Red Hat
>
> 
>
> ga...@gmail.comT: 972-9-7692230
> 
> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>
 ___
>>> Devel mailing list -- devel@ovirt.org
>>> To unsubscribe send an email to devel-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/
>>>
>> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/UO3LDUD36D6SZSIKI2GZBOS2Y2E4QRC3/
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/EDJZHI3A4OPY57VHAQVAW7F3CZ5Q4U2M/


[ovirt-devel] Re: [ovirt-users] Please don't remove instance type

2019-03-07 Thread Michal Skrivanek


> On 7 Mar 2019, at 10:43, Baptiste Agasse  
> wrote:
> 
> - On 15 Feb 19, at 16:36, Michal Skrivanek  wrote:
> 
> 
> On 12 Feb 2019, at 22:21, Greg Sheremeta  wrote:
> 
> Hi!
> 
> On Sat, Feb 2, 2019 at 1:35 PM Baptiste Agasse
> <baptiste.aga...@lyra-network.com> wrote:
> Hi all,
> 
> We are happy oVirt users for some years now (we started with 3.6, now on 4.2) 
> and we manage most of our virtualization stacks with it. To provision and 
> manage our machines, we use the foreman (for bare metal and virtual machines) 
> on top of it. I made some little contributions to the foreman and other 
> underlying stuff to have a deeper integration with oVirt, like to be able to 
> select instance type directly from foreman interface/api and we rely on it. 
> We use instance types to standardize our vms by defining system resources 
> (memory, cpu and cpu topology) console type, boot options. On top of that we 
> plan to use templates to apply OS (CentOS 7 and CentOS 6 actually). Having 
> resources definitions separated from OS installation help us to keep instance 
> types and templates lists small and don't bother users about some technical 
> underlying stuff. As we are interested in automating oVirt maintenance tasks 
> and configuration with ansible, I asked at FOSDEM oVirt booth if there is any 
> ansible module to manage instance types in Ovirt as I didn't find it in ovirt 
> ansible infra repo. The person to whom I asked the question said that you are 
> planning to remove instance types from ovirt, and this makes me sad :(. So 
> here I am to ask why do you plan to remove instance types from oVirt. As far 
> as I know, it's fairly common to have "instance types" / "flavors" / "sizes" 
> on one side and then templates (bare OS, preinstalled appliances...) on other 
> side and pick one of each to make an instance. If this first part is missing 
> in future versions of ovirt, it will be a pain point for us. So, my question 
> is, do you really plan to remove instance types permanently?
> 
> I don't know the future plans (maybe someone else can comment), but I have 
> heard that instance types are barely used. You might be the first person I 
> know of who is using them.
> 
> The argument for keeping templates but removing instance types is probably 
> that templates already are effectively instance types. That's why I never use 
> them. For example, create a CentOS template with 16 CPUs, 32GB RAM, 500GB 
> disk ... that's effectively a large instance type. Create another template 
> with 1 CPU, 2GB RAM, 30GB disk ... that's effectively a small instance type.
> 
> Is there a use case beyond this that instance types provide that templates 
> don’t?
> 
> It was supposed to give better abstraction for hw, and more importantly 
> something you can change later on and it gets reflected in all VMs using that 
> type. Problem with that is that it got quite complex and we never really 
> found the right cut between what belongs to Instance and what to Template. It 
> works…but there are a few corner cases here and there which are quite difficult 
> to fix. 
> 
> But no, we do not plan to remove them. It’s just in a deep maintenance mode 
> where we don’t really invest time to cover REST, ansible and all the bells 
> and whistles. If it works for you, great. If not and you would want to submit 
> a fix then please feel free to do so too.
> 
> Thanks,
> michal
> 
> Hi,
> 
> First, thank you both for your answers, and sorry for the delay. To be more 
> clear on how we use instance types and templates, we use them like our 
> external cloud providers use these kinds of concepts:
> 
> * A template is a pre-provisioned OS (and maybe one or more applications 
> installed on top of it). A template needs storage space on storage domain(s) 
> to be stored. 
> * Instance types are a "size" and other "metadata" applied to the template/VM 
> (e.g. CPU/core count, RAM size, headless VM, SPICE or VNC, SCSI support, HA, 
> ...). You can have a lot of "profiles" at "no cost" because it's just 
> configuration

yep, that was the idea for them. It’s just that not enough people expressed 
their interest in this feature…

> 
> IMHO, on our workload, we have a limited set of templates but a lot of 
> different "sizes/types" of VMs. If instead of using instance types + 
> templates we only used templates, we would have a lot of templates with 
> mostly the same OSes/applications preinstalled on them, and the 
> maintenance/storage costs would be a lot greater than today. For some corner 
> cases, we also have "non templated" VMs and we apply an instance type to 
> them too (we enforce that any VM on our clusters, built from a template or 
> not, has an instance type applied to it)
> 
> We are greatly interested in ansible modules to maintain and configure our 
> multiple oVirt stacks. I know that you will not invest time in providing 
> ansible module for instance types as you said that part of ovirt is in

[ovirt-devel] Re: Taking down the engine without setting global maintenance

2019-03-07 Thread Martin Sivak
> When the engine goes down, it can't know if it's part of a
> graceful/clean reboot. It can be due to a problem, which is severe
> enough to take the machine down and not take it up again, but still
> not severe enough to prevent clean shutdown of the engine itself.

That and the fact that the engine does not have direct access to
storage and has to go through vdsm. Meaning the flagging mechanism
might not be reliable enough. Also there has to be an automatic way to
reset it and make the VM start again or the user will wonder why the
next outage left the engine down.

Figuring out the rules for the two automatic actions (get GM and reset
GM) is not trivial.

> perhaps at least
> make HA wait longer (say, 30 minutes), and/or notify a few times by
> email, or something like that.

The delay is at least 5 minutes before shutdown is initiated and you
will get a couple of emails (well, at least one; I do not remember how
often we repeat it).

> I simply wonder how many times HA actually saved people from a
> long(er) unplanned engine downtime compared to how many times it was
> simply annoyingly restarting the vm in the middle of some routine
> maintenance...

You do not touch the engine that often and that is why people forget
the right procedure. Engine offline "migration" was visible in many
logs I reviewed during bug analyses. So this is probably working well
for most people most of the time (eg. when nothing is changing).

Martin

On Thu, Mar 7, 2019 at 11:04 AM Yedidyah Bar David  wrote:
>
> On Thu, Mar 7, 2019 at 11:41 AM Simone Tiraboschi  wrote:
> >
> >
> >
> > On Thu, Mar 7, 2019 at 10:34 AM Yedidyah Bar David  wrote:
> >>
> >> On Thu, Mar 7, 2019 at 11:30 AM Martin Sivak  wrote:
> >> >
> >> > Hi,
> >> >
> >> > there is no way to distinguish an engine that is not responsive
> >> > (software or network issue) from a VM that is being powered off. The
> >> > shutdown takes some time during which you just do not know.
> >>
> >> _I_ do not know, but the user might still know beforehand.
> >>
> >> > Global
> >> > maintenance informs the tooling in advance that something like this is
> >> > going to happen.
> >>
> >> Yes. But users keep forgetting setting it. So I am trying to come up
> >> with something that will fix that :-)
> >
> >
> > Now we have exactly the opposite:
> > engine-setup is already checking for global maintenance mode (the check 
> > acts on the engine DB over what the hosts report when polled so we have a 
> > bit of latency here) and engine-setup is exiting if we are on hosted-engine 
> > and not in global maintenance mode.
> > https://github.com/oVirt/ovirt-engine/blob/master/packaging/setup/plugins/ovirt-engine-common/ovirt-engine/system/he.py#L49
>
> You are right, if the engine restart was only via engine-setup. But
> there might be other reasons for restarting.
>
> Martin's claim, AFAIU, is more-or-less:
>
> When the engine goes down, it can't know if it's part of a
> graceful/clean reboot. It can be due to a problem, which is severe
> enough to take the machine down and not take it up again, but still
> not severe enough to prevent clean shutdown of the engine itself.
>
> Martin - is it so?
>
> Not sure I agree personally, that this flow is likely enough to make
> my suggestion problematic (meaning, will cause HA to leave the engine
> vm down, when it was actually better to try starting it on another
> host). But I can see the point.
>
> Let's say that I mainly think we should differentiate between a clean
> shutdown and a non-responsive engine (died via a power cut, or network
> problem, or whatever). If we do not want to consider this as a "global
> maintenance" (meaning, do nothing until the user clears it, or if we
> set it ourselves, until the engine starts again), perhaps at least
> make HA wait longer (say, 30 minutes), and/or notify a few times by
> email, or something like that.
>
> I simply wonder how many times HA actually saved people from a
> long(er) unplanned engine downtime compared to how many times it was
> simply annoyingly restarting the vm in the middle of some routine
> maintenance...
>
> >
> >
> >>
> >>
> >> Perhaps instead of my original text, use something like "Right before
> >> the engine goes down, it should set global maintenance".
> >>
> >> >
> >> > Who do you expect should be touching the shared storage? The engine VM
> >> > itself? That might be possible, but remember the jboss instance is
> >> > just the top of the process hierarchy. There are a lot of components
> >> > where something might break during shutdown (filesystem umount timeout
> >> > for example).
> >>
> >> I did say "engine", not "engine vm". But see above for perhaps clearer
> >> text.
> >>
> >> >
> >> > Martin
> >> >
> >> > On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David  
> >> > wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > How about making this change:
> >> > >
> >> > > Right before the engine goes down cleanly, it marks the shared storage
> >> > > saying it did not crash but

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Dafna Ron
first set of tests are failing on the same issue.
if Benny says its a real issue then lets merge the fix

Thanks,
Dafna


On Thu, Mar 7, 2019 at 10:18 AM Benny Zlotnik  wrote:

> It looks like a real issue, in patch[1] the name of the config value was
> changed, however, the name fn_db_add_config_value wasn't change, so it
> works when upgrading but not in a clean installation
>
> I have posted a fix[2]
>
>
>
> [1] -
> https://gerrit.ovirt.org/#/c/98228/3/packaging/dbscripts/upgrade/pre_upgrade/_config.sql
> [2] - https://gerrit.ovirt.org/#/c/98317/
>
> On Thu, Mar 7, 2019 at 12:13 PM Dafna Ron  wrote:
>
>> We do not yet know if this is a real issue .
>> I will update when I have more information.
>>
>> On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:
>>
>>> Did it fail once ir all build fail with error?
>>>
>>> If it failed only once, run again to ensure this is a real error.
>>>
>>> How many times this test fail in the past week, month, year?
>>>
>>> On Thu, Mar 7, 2019, 11:24 Galit Rosenthal >>
 Hi,

 we are failing basic suite master ( ovirt-engine)

 It looks a problem in ovirt-engine

 Can you please have a look at the issue?

 Thanks,
 Galit

 Error [1]:
 'NoneType' object has no attribute 'status'

  >> begin captured logging << 
 ovirtlago.testlib: ERROR: * Unhandled exception in  
 at 0x7f5e9feb3ed8>
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, 
 in assert_equals_within
 res = func()
   File 
 "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
  line 1153, in 
 lambda: api.disks.get(disk_name).status.state == 'ok',
 AttributeError: 'NoneType' object has no attribute 'status'
 - >> end captured logging << -


 US CQ of the ovirt-engine results can be found [2]
 We see the same error DS as well [3]


 [1]
 https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/

 [2]
 https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/

 [3]
 https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
 --

 GALIT ROSENTHAL

 SOFTWARE ENGINEER

 Red Hat

 

 ga...@gmail.comT: 972-9-7692230
 
 ___
 Devel mailing list -- devel@ovirt.org
 To unsubscribe send an email to devel-le...@ovirt.org
 Privacy Statement: https://www.ovirt.org/site/privacy-policy/
 oVirt Code of Conduct:
 https://www.ovirt.org/community/about/community-guidelines/
 List Archives:
 https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/

>>> ___
>> Devel mailing list -- devel@ovirt.org
>> To unsubscribe send an email to devel-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/
>>
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/UO3LDUD36D6SZSIKI2GZBOS2Y2E4QRC3/


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Benny Zlotnik
It looks like a real issue: in patch[1] the name of the config value was
changed, however, the name in the fn_db_add_config_value call wasn't changed,
so it works when upgrading but not in a clean installation

I have posted a fix[2]



[1] -
https://gerrit.ovirt.org/#/c/98228/3/packaging/dbscripts/upgrade/pre_upgrade/_config.sql
[2] - https://gerrit.ovirt.org/#/c/98317/
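
A side note on the test-side symptom: api.disks.get(disk_name) returns None
when the disk does not exist, so the polling lambda in the test blows up with
the AttributeError instead of reporting a missing disk. A defensive variant
of the condition (illustration only - the real fix is the dbscripts change
above) could look like:

    def disk_is_ok(api, disk_name):
        # Return False while the disk is missing, instead of raising
        # AttributeError on None.
        disk = api.disks.get(disk_name)
        return disk is not None and disk.status.state == 'ok'

    # used in place of:
    #   lambda: api.disks.get(disk_name).status.state == 'ok'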

On Thu, Mar 7, 2019 at 12:13 PM Dafna Ron  wrote:

> We do not yet know if this is a real issue .
> I will update when I have more information.
>
> On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:
>
>> Did it fail once ir all build fail with error?
>>
>> If it failed only once, run again to ensure this is a real error.
>>
>> How many times this test fail in the past week, month, year?
>>
>> On Thu, Mar 7, 2019, 11:24 Galit Rosenthal >
>>> Hi,
>>>
>>> we are failing basic suite master ( ovirt-engine)
>>>
>>> It looks a problem in ovirt-engine
>>>
>>> Can you please have a look at the issue?
>>>
>>> Thanks,
>>> Galit
>>>
>>> Error [1]:
>>> 'NoneType' object has no attribute 'status'
>>>
>>>  >> begin captured logging << 
>>> ovirtlago.testlib: ERROR: * Unhandled exception in  
>>> at 0x7f5e9feb3ed8>
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, 
>>> in assert_equals_within
>>> res = func()
>>>   File 
>>> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>>>  line 1153, in 
>>> lambda: api.disks.get(disk_name).status.state == 'ok',
>>> AttributeError: 'NoneType' object has no attribute 'status'
>>> - >> end captured logging << -
>>>
>>>
>>> US CQ of the ovirt-engine results can be found [2]
>>> We see the same error DS as well [3]
>>>
>>>
>>> [1]
>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>>>
>>> [2]
>>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>>>
>>> [3]
>>> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
>>> --
>>>
>>> GALIT ROSENTHAL
>>>
>>> SOFTWARE ENGINEER
>>>
>>> Red Hat
>>>
>>> 
>>>
>>> ga...@gmail.comT: 972-9-7692230
>>> 
>>> ___
>>> Devel mailing list -- devel@ovirt.org
>>> To unsubscribe send an email to devel-le...@ovirt.org
>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>> oVirt Code of Conduct:
>>> https://www.ovirt.org/community/about/community-guidelines/
>>> List Archives:
>>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>>>
>> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/U5Y5DEFAOGXRUMXINUZDUUOSQOPOHWMI/


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Dafna Ron
We do not yet know if this is a real issue.
I will update when I have more information.

On Thu, Mar 7, 2019 at 10:04 AM Nir Soffer  wrote:

> Did it fail once ir all build fail with error?
>
> If it failed only once, run again to ensure this is a real error.
>
> How many times this test fail in the past week, month, year?
>
> On Thu, Mar 7, 2019, 11:24 Galit Rosenthal 
>> Hi,
>>
>> we are failing basic suite master ( ovirt-engine)
>>
>> It looks a problem in ovirt-engine
>>
>> Can you please have a look at the issue?
>>
>> Thanks,
>> Galit
>>
>> Error [1]:
>> 'NoneType' object has no attribute 'status'
>>
>>  >> begin captured logging << 
>> ovirtlago.testlib: ERROR: * Unhandled exception in  at 
>> 0x7f5e9feb3ed8>
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in 
>> assert_equals_within
>> res = func()
>>   File 
>> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>>  line 1153, in 
>> lambda: api.disks.get(disk_name).status.state == 'ok',
>> AttributeError: 'NoneType' object has no attribute 'status'
>> - >> end captured logging << -
>>
>>
>> US CQ of the ovirt-engine results can be found [2]
>> We see the same error DS as well [3]
>>
>>
>> [1]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>>
>> [2]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>>
>> [3]
>> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
>> --
>>
>> GALIT ROSENTHAL
>>
>> SOFTWARE ENGINEER
>>
>> Red Hat
>>
>> 
>>
>> ga...@gmail.comT: 972-9-7692230
>> 
>> ___
>> Devel mailing list -- devel@ovirt.org
>> To unsubscribe send an email to devel-le...@ovirt.org
>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> List Archives:
>> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>>
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2UAAA77HDHUDWQJMNLRAQQTI23O4PCB5/


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Nir Soffer
Did it fail once, or do all builds fail with this error?

If it failed only once, run again to ensure this is a real error.

How many times did this test fail in the past week, month, year?

On Thu, Mar 7, 2019, 11:24 Galit Rosenthal 
> Hi,
>
> we are failing basic suite master ( ovirt-engine)
>
> It looks like a problem in ovirt-engine
>
> Can you please have a look at the issue?
>
> Thanks,
> Galit
>
> Error [1]:
> 'NoneType' object has no attribute 'status'
>
>  >> begin captured logging << 
> ovirtlago.testlib: ERROR: * Unhandled exception in <function <lambda> at 0x7f5e9feb3ed8>
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in assert_equals_within
>     res = func()
>   File "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py", line 1153, in <lambda>
>     lambda: api.disks.get(disk_name).status.state == 'ok',
> AttributeError: 'NoneType' object has no attribute 'status'
> - >> end captured logging << -
>
>
> US CQ of the ovirt-engine results can be found [2]
> We see the same error DS as well [3]
>
>
> [1]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>
> [2]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>
> [3]
> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
> --
>
> GALIT ROSENTHAL
>
> SOFTWARE ENGINEER
>
> Red Hat
>
> 
>
> ga...@gmail.comT: 972-9-7692230
> 
> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/NUVPCGA3CD5NZTVCTJBLRAHII2I5XJ7O/
>
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/QDRA5FUS4BXT7FOJET245D3WNNWXL37K/


[ovirt-devel] Re: Taking down the engine without setting global maintenance

2019-03-07 Thread Yedidyah Bar David
On Thu, Mar 7, 2019 at 11:41 AM Simone Tiraboschi  wrote:
>
>
>
> On Thu, Mar 7, 2019 at 10:34 AM Yedidyah Bar David  wrote:
>>
>> On Thu, Mar 7, 2019 at 11:30 AM Martin Sivak  wrote:
>> >
>> > Hi,
>> >
>> > there is no way to distinguish an engine that is not responsive
>> > (software or network issue) from a VM that is being powered off. The
>> > shutdown takes some time during which you just do not know.
>>
>> _I_ do not know, but the user might still know beforehand.
>>
>> > Global
>> > maintenance informs the tooling in advance that something like this is
>> > going to happen.
>>
>> Yes. But users keep forgetting setting it. So I am trying to come up
>> with something that will fix that :-)
>
>
> Now we have exactly the opposite:
> engine-setup is already checking for global maintenance mode (the check acts 
> on the engine DB over what the hosts report when polled so we have a bit of 
> latency here) and engine-setup is exiting if we are on hosted-engine and not 
> in global maintenance mode.
> https://github.com/oVirt/ovirt-engine/blob/master/packaging/setup/plugins/ovirt-engine-common/ovirt-engine/system/he.py#L49

You are right, if the engine were only ever restarted via engine-setup. But
there might be other reasons for restarting.

Martin's claim, AFAIU, is more-or-less:

When the engine goes down, it can't know if it's part of a
graceful/clean reboot. It can be due to a problem, which is severe
enough to take the machine down and not take it up again, but still
not severe enough to prevent clean shutdown of the engine itself.

Martin - is it so?

I am not sure I personally agree that this flow is likely enough to make
my suggestion problematic (meaning, it will cause HA to leave the engine
VM down when it was actually better to try starting it on another
host). But I can see the point.

Let's say that I mainly think we should differentiate between a clean
shutdown and a non-responsive engine (died via a power cut, or network
problem, or whatever). If we do not want to consider this as a "global
maintenance" (meaning, do nothing until the user clears it, or if we
set it ourselves, until the engine starts again), perhaps at least
make HA wait longer (say, 30 minutes), and/or notify a few times by
email, or something like that.

I simply wonder how many times HA actually saved people from a
long(er) unplanned engine downtime compared to how many times it was
simply annoyingly restarting the vm in the middle of some routine
maintenance...

>
>
>>
>>
>> Perhaps instead of my original text, use something like "Right before
>> the engine goes down, it should set global maintenance".
>>
>> >
>> > Who do you expect should be touching the shared storage? The engine VM
>> > itself? That might be possible, but remember the jboss instance is
>> > just the top of the process hierarchy. There are a lot of components
>> > where something might break during shutdown (filesystem umount timeout
>> > for example).
>>
>> I did say "engine", not "engine vm". But see above for perhaps clearer
>> text.
>>
>> >
>> > Martin
>> >
>> > On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David  wrote:
>> > >
>> > > Hi all,
>> > >
>> > > How about making this change:
>> > >
>> > > Right before the engine goes down cleanly, it marks the shared storage
>> > > saying it did not crash but exited cleanly, and then HE-HA will not
>> > > try to restart it on another host. Perhaps make this optional, so that
>> > > users can do clean shutdowns and still test HA cleanly (or some other
>> > > use cases, where users might not want this).
>> > >
>> > > This should help a lot cases where people restarted their engine for
>> > > some reason, e.g. upgrade, and forgot to set maintenance.
>> > >
>> > > Makes sense?
>> > > --
>> > > Didi
>>
>>
>>
>> --
>> Didi



-- 
Didi


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ 008_basic_ui_sanity.initialize_firefox]

2019-03-07 Thread Nir Soffer
Please run OST again to ensure we have a real issue and not a random
failure.

On Thu, Mar 7, 2019, 10:08 Marcin Sobczyk wrote:

> Hi,
>
> none of my UI-sanity-tests-optimization patches were merged, so this is not
> related to my recent work.
>
> Regards, Marcin
> On 3/7/19 8:48 AM, Galit Rosenthal wrote:
>
> Hi,
>
> basic suite master is failing (vdsm)
>
> Can you please have a look at the issue?
>
> *Error Message:*
>
> Message: Error forwarding the new session new session request for webdriver 
> should contain a location header or an 'application/json;charset=UTF-8' 
> response body with the session ID.
>
>  *from the [2]:*
>
>  *08:22:04* + xmllint --format
> /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
>  *08:22:04* <?xml version="1.0" encoding="utf-8"?>
>  *08:22:04* <testsuite tests="7" errors="1" failures="0" skip="0">
>  *08:22:04*   <testcase classname="008_basic_ui_sanity" name="init" time="0.001"/>
>  *08:22:04*   <testcase classname="008_basic_ui_sanity" name="start_grid" time="144.413">
>  *08:22:04*
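
If it does reproduce, one way to make the suite more tolerant of such grid
blips is to retry the Remote session creation a couple of times before giving
up. A rough sketch, assuming the Python selenium bindings; the hub URL,
capabilities and retry numbers are placeholders, not what the suite actually
configures:

    # Sketch: retry creating a Selenium Grid session on transient
    # "Error forwarding the new session ..." failures.
    import time

    from selenium import webdriver
    from selenium.common.exceptions import WebDriverException

    def open_remote_session(hub_url, capabilities, attempts=3, delay=10):
        last_error = None
        for _ in range(attempts):
            try:
                return webdriver.Remote(command_executor=hub_url,
                                        desired_capabilities=capabilities)
            except WebDriverException as e:
                # The hub could not hand the session to a node; wait and retry.
                last_error = e
                time.sleep(delay)
        raise last_error

    # driver = open_remote_session("http://localhost:4444/wd/hub",
    #                              {"browserName": "firefox"})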

[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Dafna Ron
This run is from 6:47 in the morning, and we merged a patch to OST right
before it.
I want to make sure we have a real issue here; right now CQ is running
on 8 changes (including engine).
I will update whether this is an actual issue, and the suspected root cause,
once CQ finishes the tests.

Thanks,
Dafna


On Thu, Mar 7, 2019 at 9:37 AM Benny Zlotnik  wrote:

> This seems to be the issue: 2019-03-07 02:01:12,421-05 ERROR
> [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (default task-1)
> [d2ac1504-3c2c-461a-adfb-332f5e4bf14f] Error during ValidateFailure.:
> java.lang.IllegalArgumentException: MaxBlockDiskSizeInGibiBytes has no
> value for version: general
>
>
> On Thu, Mar 7, 2019 at 11:24 AM Galit Rosenthal 
> wrote:
>
>> Hi,
>>
>> basic suite master is failing (ovirt-engine)
>>
>> It looks like a problem in ovirt-engine
>>
>> Can you please have a look at the issue?
>>
>> Thanks,
>> Galit
>>
>> Error [1]:
>> 'NoneType' object has no attribute 'status'
>>
>>  >> begin captured logging << 
>> ovirtlago.testlib: ERROR: * Unhandled exception in <function <lambda> at
>> 0x7f5e9feb3ed8>
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in 
>> assert_equals_within
>> res = func()
>>   File 
>> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>>  line 1153, in <lambda>
>> lambda: api.disks.get(disk_name).status.state == 'ok',
>> AttributeError: 'NoneType' object has no attribute 'status'
>> - >> end captured logging << -
>>
>>
>> The upstream CQ results for ovirt-engine can be found at [2]
>> We see the same error downstream as well [3]
>>
>>
>> [1]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>>
>> [2]
>> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>>
>> [3]
>> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0


[ovirt-devel] Re: [ovirt-users] Please don't remove instance type

2019-03-07 Thread Baptiste Agasse
- On 15 Feb 19, at 16:36, Michal Skrivanek wrote:






On 12 Feb 2019, at 22:21, Greg Sheremeta <gsher...@redhat.com> wrote:

Hi! 

On Sat, Feb 2, 2019 at 1:35 PM Baptiste Agasse <baptiste.aga...@lyra-network.com> wrote:

Hi all, 

We have been happy oVirt users for some years now (we started with 3.6, now on
4.2) and we manage most of our virtualization stacks with it. To provision and
manage our machines, we use the Foreman (for bare metal and virtual machines)
on top of it. I made some small contributions to the Foreman and other
underlying pieces to get a deeper integration with oVirt, like being able to
select the instance type directly from the Foreman interface/API, and we rely
on it. We use instance types to standardize our VMs by defining system
resources (memory, CPU and CPU topology), console type and boot options. On top
of that we plan to use templates to apply the OS (CentOS 7 and CentOS 6
currently). Keeping resource definitions separate from the OS installation
helps us keep the instance type and template lists small and avoids bothering
users with underlying technical details. As we are interested in automating
oVirt maintenance tasks and configuration with ansible, I asked at the FOSDEM
oVirt booth if there is any ansible module to manage instance types in oVirt,
as I didn't find one in the ovirt ansible infra repo. The person I asked said
that you are planning to remove instance types from oVirt, and this makes me
sad :(. So here I am to ask why you plan to remove instance types from oVirt.
As far as I know, it's fairly common to have "instance types" / "flavors" /
"sizes" on one side and templates (bare OS, preinstalled appliances...) on the
other, and to pick one of each to make an instance. If this first part is
missing in a future version of oVirt, it will be a pain point for us. So, my
question is: do you really plan to remove instance types for good?




I don't know the future plans (maybe someone else can comment), but I have 
heard that instance types are barely used. You might be the first person I know 
of who is using them. 

The argument for keeping templates but removing instance types is probably that 
templates already are effectively instance types. That's why I never use them. 
For example, create a CentOS template with 16 CPUs, 32GB RAM, 500GB disk ... 
that's effectively a large instance type. Create another template with 1 CPU, 
2GB RAM, 30GB disk ... that's effectively a small instance type. 

Is there a use case beyond this that instance types provide that templates 
don’t? 



It was supposed to give a better abstraction for hardware, and more importantly
something you can change later on and have it reflected in all VMs using that
type. The problem is that it got quite complex and we never really found the
right cut between what belongs to the instance type and what to the template.
It works… but there are a few corner cases here and there which are quite
difficult to fix.

But no, we do not plan to remove them. It’s just in a deep maintenance mode 
where we don’t really invest time to cover REST, ansible and all the bells and 
whistles. If it works for you, great. If not and you would want to submit a fix 
then please feel free to do so too. 

Thanks, 
michal 


Hi, 

First, thank you both for your answers, and sorry for the delay. To be clearer
about how we use instance types and templates: we use them the way our external
cloud providers use these concepts:

* A template is a pre-provisioned OS (and possibly one or more applications
installed on top of it). A template needs storage space on the storage
domain(s) where it is kept.
* An instance type is the "size" and other "metadata" applied to the
template/VM (e.g. CPU/core count, RAM size, headless VM, SPICE or VNC console,
SCSI support, HA, ...). You can have a lot of "profiles" at "no cost" because
it's just configuration.

On our workload we have a limited set of templates but a lot of different
"sizes/types" of VMs. If, instead of using instance types + templates, we only
used templates, we would end up with a lot of templates with mostly the same
OS/applications preinstalled, and the maintenance/storage cost would be a lot
greater than today. For some corner cases we also have "non templated" VMs and
we apply an instance type to them too (we enforce that any VM on our clusters,
built from a template or not, has an instance type applied to it).

We are greatly interested in ansible modules to maintain and configure our
multiple oVirt stacks. I understand that you will not invest time in providing
an ansible module for instance types, since you said that part of oVirt is in
maintenance mode, but are contributions welcome on this part (I saw that the
SDK already covers it)?
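
For reference, a minimal sketch of creating an instance type through
ovirt-engine-sdk-python 4.x, which is roughly what such an ansible module
would end up wrapping; the URL, credentials and sizes below are placeholders:

    # Minimal sketch: create an instance type via the oVirt Python SDK.
    # Placeholders: engine URL, credentials, CA file, name and sizes.
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',
        username='admin@internal',
        password='secret',
        ca_file='ca.pem',
    )
    instance_types = connection.system_service().instance_types_service()
    instance_types.add(
        types.InstanceType(
            name='medium-web',
            memory=4 * 2**30,  # 4 GiB
            cpu=types.Cpu(
                topology=types.CpuTopology(sockets=1, cores=2, threads=1),
            ),
        ),
    )
    connection.close()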

Have a nice day. 

Cheers. 




[ovirt-devel] Re: Taking down the engine without setting global maintenance

2019-03-07 Thread Simone Tiraboschi
On Thu, Mar 7, 2019 at 10:34 AM Yedidyah Bar David  wrote:

> On Thu, Mar 7, 2019 at 11:30 AM Martin Sivak  wrote:
> >
> > Hi,
> >
> > there is no way to distinguish an engine that is not responsive
> > (software or network issue) from a VM that is being powered off. The
> > shutdown takes some time during which you just do not know.
>
> _I_ do not know, but the user might still know beforehand.
>
> > Global
> > maintenance informs the tooling in advance that something like this is
> > going to happen.
>
> Yes. But users keep forgetting to set it. So I am trying to come up
> with something that will fix that :-)
>

Now we have exactly the opposite:
engine-setup already checks for global maintenance mode (the check
acts on the engine DB, based on what the hosts report when polled, so we have
a bit of latency here) and engine-setup exits if we are on hosted-engine
and not in global maintenance mode.
https://github.com/oVirt/ovirt-engine/blob/master/packaging/setup/plugins/ovirt-engine-common/ovirt-engine/system/he.py#L49
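
For anyone writing similar tooling outside engine-setup, the host-side
equivalent of that check is to parse 'hosted-engine --vm-status --json'. A
minimal sketch; the exact JSON key names are an assumption and may differ a
bit between ovirt-hosted-engine-ha versions:

    # Sketch: refuse to proceed unless global maintenance is set,
    # as reported by the HA agent on a hosted-engine host.
    import json
    import subprocess

    def in_global_maintenance():
        out = subprocess.check_output(
            ['hosted-engine', '--vm-status', '--json'])
        status = json.loads(out)
        return bool(status.get('global_maintenance'))

    if not in_global_maintenance():
        raise SystemExit('global maintenance is not set, refusing to continue')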



>
> Perhaps instead of my original text, use something like "Right before
> the engine goes down, it should set global maintenance".
>
> >
> > Who do you expect should be touching the shared storage? The engine VM
> > itself? That might be possible, but remember the jboss instance is
> > just the top of the process hierarchy. There are a lot of components
> > where something might break during shutdown (filesystem umount timeout
> > for example).
>
> I did say "engine", not "engine vm". But see above for perhaps clearer
> text.
>
> >
> > Martin
> >
> > On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David 
> wrote:
> > >
> > > Hi all,
> > >
> > > How about making this change:
> > >
> > > Right before the engine goes down cleanly, it marks the shared storage
> > > saying it did not crash but exited cleanly, and then HE-HA will not
> > > try to restart it on another host. Perhaps make this optional, so that
> > > users can do clean shutdowns and still test HA cleanly (or some other
> > > use cases, where users might not want this).
> > >
> > > This should help a lot cases where people restarted their engine for
> > > some reason, e.g. upgrade, and forgot to set maintenance.
> > >
> > > Makes sense?
> > > --
> > > Didi
>
>
>
> --
> Didi
>


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Benny Zlotnik
This seems to be the issue: 2019-03-07 02:01:12,421-05 ERROR
[org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (default task-1)
[d2ac1504-3c2c-461a-adfb-332f5e4bf14f] Error during ValidateFailure.:
java.lang.IllegalArgumentException: MaxBlockDiskSizeInGibiBytes has no
value for version: general
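
For anyone debugging this, the value comes from the engine's vdc_options
table, so a quick way to see which cluster versions define it looks roughly
like the following. This is a sketch, assuming local access to the engine DB
with psycopg2; the real credentials normally live in
/etc/ovirt-engine/engine.conf.d/10-setup-database.conf:

    # Sketch: list the versions for which the option is defined.
    import psycopg2

    conn = psycopg2.connect(dbname='engine', user='engine',
                            password='...', host='localhost')
    cur = conn.cursor()
    cur.execute(
        "SELECT version, option_value FROM vdc_options"
        " WHERE option_name = %s",
        ('MaxBlockDiskSizeInGibiBytes',),
    )
    for version, value in cur.fetchall():
        # A missing row (or default) for version 'general' would explain
        # why AddDiskCommand's validation cannot resolve the limit.
        print(version, value)
    conn.close()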


On Thu, Mar 7, 2019 at 11:24 AM Galit Rosenthal  wrote:

> Hi,
>
> basic suite master is failing (ovirt-engine)
>
> It looks like a problem in ovirt-engine
>
> Can you please have a look at the issue?
>
> Thanks,
> Galit
>
> Error [1]:
> 'NoneType' object has no attribute 'status'
>
>  >> begin captured logging << 
> ovirtlago.testlib: ERROR: * Unhandled exception in <function <lambda> at
> 0x7f5e9feb3ed8>
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line 234, in 
> assert_equals_within
> res = func()
>   File 
> "/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
>  line 1153, in <lambda>
> lambda: api.disks.get(disk_name).status.state == 'ok',
> AttributeError: 'NoneType' object has no attribute 'status'
> - >> end captured logging << -
>
>
> The upstream CQ results for ovirt-engine can be found at [2]
> We see the same error downstream as well [3]
>
>
> [1]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/
>
> [2]
> https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/
>
> [3]
> https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0


[ovirt-devel] Re: Taking down the engine without setting global maintenance

2019-03-07 Thread Yedidyah Bar David
On Thu, Mar 7, 2019 at 11:30 AM Martin Sivak  wrote:
>
> Hi,
>
> there is no way to distinguish an engine that is not responsive
> (software or network issue) from a VM that is being powered off. The
> shutdown takes some time during which you just do not know.

_I_ do not know, but the user might still know beforehand.

> Global
> maintenance informs the tooling in advance that something like this is
> going to happen.

Yes. But users keep forgetting to set it. So I am trying to come up
with something that will fix that :-)

Perhaps instead of my original text, use something like "Right before
the engine goes down, it should set global maintenance".
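
To illustrate (this is not an existing feature), the hook itself would be
trivial once we settle on where to call it from, since 'hosted-engine
--set-maintenance' already does the actual work on a hosted-engine host. A
sketch:

    # Hypothetical pre-shutdown hook: flip global maintenance on before a
    # planned engine shutdown so HE-HA does not try to restart the VM.
    # Where to wire this in (systemd, engine, tooling) is the open question.
    import subprocess

    def set_global_maintenance(enabled):
        mode = 'global' if enabled else 'none'
        subprocess.check_call(
            ['hosted-engine', '--set-maintenance', '--mode=%s' % mode])

    # planned shutdown path:
    #   set_global_maintenance(True)
    #   ... stop/upgrade/restart the engine ...
    #   set_global_maintenance(False)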

>
> Who do you expect should be touching the shared storage? The engine VM
> itself? That might be possible, but remember the jboss instance is
> just the top of the process hierarchy. There are a lot of components
> where something might break during shutdown (filesystem umount timeout
> for example).

I did say "engine", not "engine vm". But see above for perhaps clearer
text.

>
> Martin
>
> On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David  wrote:
> >
> > Hi all,
> >
> > How about making this change:
> >
> > Right before the engine goes down cleanly, it marks the shared storage
> > saying it did not crash but exited cleanly, and then HE-HA will not
> > try to restart it on another host. Perhaps make this optional, so that
> > users can do clean shutdowns and still test HA cleanly (or some other
> > use cases, where users might not want this).
> >
> > This should help a lot cases where people restarted their engine for
> > some reason, e.g. upgrade, and forgot to set maintenance.
> >
> > Makes sense?
> > --
> > Didi



-- 
Didi


[ovirt-devel] Re: Taking down the engine without setting global maintenance

2019-03-07 Thread Martin Sivak
Hi,

there is no way to distinguish an engine that is not responsive
(software or network issue) from a VM that is being powered off. The
shutdown takes some time during which you just do not know. Global
maintenance informs the tooling in advance that something like this is
going to happen.

Who do you expect should be touching the shared storage? The engine VM
itself? That might be possible, but remember the jboss instance is
just the top of the process hierarchy. There are a lot of components
where something might break during shutdown (filesystem umount timeout
for example).

Martin

On Thu, Mar 7, 2019 at 9:27 AM Yedidyah Bar David  wrote:
>
> Hi all,
>
> How about making this change:
>
> Right before the engine goes down cleanly, it marks the shared storage
> saying it did not crash but exited cleanly, and then HE-HA will not
> try to restart it on another host. Perhaps make this optional, so that
> users can do clean shutdowns and still test HA cleanly (or some other
> use cases, where users might not want this).
>
> This should help a lot cases where people restarted their engine for
> some reason, e.g. upgrade, and forgot to set maintenance.
>
> Makes sense?
> --
> Didi


[ovirt-devel] [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ / 004_basic_sanity.verify_glance_import]

2019-03-07 Thread Galit Rosenthal
Hi,

basic suite master is failing (ovirt-engine)

It looks like a problem in ovirt-engine

Can you please have a look at the issue?

Thanks,
Galit

Error [1]:
'NoneType' object has no attribute 'status'

 >> begin captured logging << 
ovirtlago.testlib: ERROR: * Unhandled exception in <function <lambda> at 0x7f5e9feb3ed8>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirtlago/testlib.py", line
234, in assert_equals_within
res = func()
  File 
"/home/jenkins/workspace/ovirt-master_change-queue-tester/ovirt-system-tests/basic-suite-master/test-scenarios/004_basic_sanity.py",
line 1153, in <lambda>
lambda: api.disks.get(disk_name).status.state == 'ok',
AttributeError: 'NoneType' object has no attribute 'status'
- >> end captured logging << -
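
As a side note, the check in the test could be made defensive against the disk
not being found at all, so the failure shows up as a timeout instead of an
AttributeError. A sketch only; it would not fix the underlying import failure:

    # Sketch: treat a disk the API cannot find yet as "not ready".
    def glance_disk_is_ok(api, disk_name):
        disk = api.disks.get(disk_name)
        return disk is not None and disk.status.state == 'ok'

    # drop-in replacement for the bare lambda passed to assert_equals_within:
    #   lambda: glance_disk_is_ok(api, disk_name)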


The upstream CQ results for ovirt-engine can be found at [2]
We see the same error downstream as well [3]


[1]
https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/testReport/junit/(root)/004_basic_sanity/running_tests___basic_suite_el7_x86_64___verify_glance_import/

[2]
https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-master_change-queue-tester/13295/

[3]
https://rhv-devops-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/blue/rest/organizations/jenkins/pipelines/rhv-master_change-queue-tester/runs/1054/nodes/1758/steps/1883/log/?start=0
-- 

GALIT ROSENTHAL

SOFTWARE ENGINEER

Red Hat



ga...@gmail.com    T: 972-9-7692230



[ovirt-devel] Taking down the engine without setting global maintenance

2019-03-07 Thread Yedidyah Bar David
Hi all,

How about making this change:

Right before the engine goes down cleanly, it marks the shared storage
saying it did not crash but exited cleanly, and then HE-HA will not
try to restart it on another host. Perhaps make this optional, so that
users can do clean shutdowns and still test HA cleanly (or some other
use cases, where users might not want this).
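
To illustrate the idea only (nothing like this exists today), the agent's
restart decision could take such a marker into account, for example with an
expiry so a stale marker cannot disable HA forever:

    # Hypothetical sketch of the restart decision with a clean-shutdown
    # marker; the marker format and the 30 minute expiry are made up.
    import json
    import time

    CLEAN_SHUTDOWN_MAX_AGE = 30 * 60  # seconds

    def should_restart_engine_vm(marker_blob, now=None):
        now = now or time.time()
        if not marker_blob:
            return True  # no marker: behave exactly as HA does today
        marker = json.loads(marker_blob)
        clean = marker.get('clean_shutdown', False)
        age = now - marker.get('timestamp', 0)
        return not clean or age > CLEAN_SHUTDOWN_MAX_AGE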

This should help a lot in cases where people restarted their engine for
some reason, e.g. an upgrade, and forgot to set maintenance.

Makes sense?
-- 
Didi


[ovirt-devel] Re: [ OST Failure Report ] [ oVirt Master ] [ 07-03-2019 ] [ 008_basic_ui_sanity.initialize_firefox]

2019-03-07 Thread Marcin Sobczyk

Hi,

none of my UI-sanity-tests-optimization patches were merged, so this is
not related to my recent work.


Regards, Marcin

On 3/7/19 8:48 AM, Galit Rosenthal wrote:

Hi,

basic suite master is failing (vdsm)

Can you please have a look at the issue?

*Error Message:*
Message: Error forwarding the new session new session request for 
webdriver should contain a location header or an 
'application/json;charset=UTF-8' response body with the session ID.

*from the [2]:*
*08:22:04* + xmllint --format
/dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
*08:22:04* <?xml version="1.0" encoding="utf-8"?>
*08:22:04* <testsuite tests="7" errors="1" failures="0" skip="0">
*08:22:04*   <testcase classname="008_basic_ui_sanity" name="init" time="0.001"/>
*08:22:04*   <testcase classname="008_basic_ui_sanity" name="start_grid" time="144.413">
*08:22:04*