Hi,

On 9/29/19 12:54 AM, Nir Soffer wrote:
On Sat, Sep 28, 2019 at 11:04 PM Rik Theys <rik.th...@esat.kuleuven.be> wrote:

    Hi Nir,

    Thank you for your time.

    On 9/27/19 4:27 PM, Nir Soffer wrote:


    On Fri, Sep 27, 2019, 12:37 Rik Theys <rik.th...@esat.kuleuven.be> wrote:

        Hi,

        After upgrading to 4.3.6, my storage domain can no longer be
        activated, rendering my data center useless.

        My storage domain is local storage on a filesystem backed by
        VDO/LVM. It seems 4.3.6 has added support for 4k storage.
        My VDO does not have the 'emulate512' flag set.


    This configuration is not supported before 4.3.6. Various
    operations may fail when
    reading or writing to storage.
    I was not aware of this when I set it up, as I did not expect it
    to affect a setup where oVirt uses local storage (a file system
    location).

    4.3.6 detects the storage block size, creates compatible storage
    domain metadata, and considers the block size when accessing
    storage.

        I've tried downgrading all packages on the host to the
        previous versions (with ioprocess 1.2), but this does not
        seem to make any difference.


    Downgrading should solve your issue, but without any logs we can
    only guess.

    I was able to work around my issue by downgrading to ioprocess 1.1
    (and vdsm-4.30.24). Downgrading to only 1.2 did not solve my
    issue. With ioprocess downgraded to 1.1, I did not have to
    downgrade the engine (still on 4.3.6).

ioprocess 1.1 is not recommended; you really want to use 1.3.0.

    I think I now have a better understanding of what happened to
    trigger this.

    During a nightly yum-cron run, the ioprocess and vdsm packages on
    the host were upgraded to ioprocess 1.3 and vdsm 4.30.33. At that
    point, the engine started to log:

    2019-09-27 03:40:27,472+02 INFO
    [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
    (EE-ManagedThreadFactory-engine-Thread-384418) [695f38cc]
    Executing with domain map: {6bdf1a0d-274b-4195-8ff5-a5c002ea1a77=active}

    2019-09-27 03:40:27,646+02 WARN
    [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
    (EE-ManagedThreadFactory-engine-Thread-384418) [695f38cc]
    Unexpected return value: Status [code=348, message=Block size does
    not match storage block size: 'block_size=512, storage_block_size=4096']

This means that when activating the storage domain, vdsm detected that the storage block size
is 4k, but the domain metadata reports a block size of 512.

This combination may partly work for a localfs domain since we don't use sanlock with local storage,
vdsm does not use direct I/O when writing to storage, and it always uses a 4k block size when
reading metadata from storage.

Note that with ovirt-imageio older than 1.5.2, image uploads and downloads may fail when using 4k storage.
Recent ovirt-imageio detects and uses the correct block size.
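
For context, the detection boils down to probing whether the storage accepts 512-byte direct I/O.
This is only a minimal sketch of the idea, not vdsm's or ioprocess's actual code:

import errno
import mmap
import os

def probe_block_size(directory):
    # Sketch only: try an O_DIRECT write of 512 bytes in the given
    # directory; if the kernel rejects it with EINVAL, the underlying
    # storage (e.g. VDO without emulate512) needs 4096-byte I/O.
    path = os.path.join(directory, ".probe-block-size")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
    try:
        for block_size in (512, 4096):
            buf = mmap.mmap(-1, block_size)  # page-aligned buffer
            try:
                os.write(fd, buf)
                return block_size
            except OSError as e:
                if e.errno != errno.EINVAL:
                    raise
            finally:
                buf.close()
        raise RuntimeError("could not determine block size")
    finally:
        os.close(fd)
        os.unlink(path)

On your VDO-backed /data/images this kind of probe should return 4096, while on plain 512-byte
storage it returns 512.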

    2019-09-27 03:40:27,646+02 INFO
    [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
    (EE-ManagedThreadFactory-engine-Thread-384418) [695f38cc] FINISH,
    ConnectStoragePoolVDSCommand, return: , log id: 483c7a17

    I did not notice at first that this was a storage-related issue
    and assumed it might get resolved by also upgrading the engine. So
    in the morning I upgraded the engine to 4.3.6, but this did not
    resolve my issue.

    I then found the above error in the engine log. In the release
    notes of 4.3.6 I read about the 4k support.

    I then downgraded ioprocess (and vdsm) to ioprocess 1.2, but that
    did not solve my issue either. This is when I contacted the list
    with my question.

    Afterwards I found in the ioprocess rpm changelog that (partial?)
    4k support was also in 1.2. I kept downgrading until I reached
    ioprocess 1.1 (without 4k support), and at that point I could
    re-attach my storage domain.

    You mention above that 4.3.6 detects the block size and configures
    the storage domain metadata accordingly? I've checked the
    dom_md/metadata file and it shows:

    ALIGNMENT=1048576
    BLOCK_SIZE=512
    CLASS=Data
    DESCRIPTION=studvirt1-Local
    IOOPTIMEOUTSEC=10
    LEASERETRIES=3
    LEASETIMESEC=60
    LOCKPOLICY=
    LOCKRENEWALINTERVALSEC=5
    MASTER_VERSION=1
    POOL_DESCRIPTION=studvirt1-Local
    POOL_DOMAINS=6bdf1a0d-274b-4195-8ff5-a5c002ea1a77:Active
    POOL_SPM_ID=-1
    POOL_SPM_LVER=-1
    POOL_UUID=085f02e8-c3b4-4cef-a35c-e357a86eec0c
    REMOTE_PATH=/data/images
    ROLE=Master
    SDUUID=6bdf1a0d-274b-4195-8ff5-a5c002ea1a77
    TYPE=LOCALFS
    VERSION=5
    _SHA_CKSUM=9dde06bbc9f2316efc141565738ff32037b1ff66

So you have a v5 localfs storage domain. Because we don't use leases with local storage, this domain
should work with 4.3.6 if you modify this line in the domain metadata:

BLOCK_SIZE=4096

To modify the line, you have to delete the checksum:

_SHA_CKSUM=9dde06bbc9f2316efc141565738ff32037b1ff66
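
If you prefer to script the edit instead of doing it by hand, here is a minimal sketch (the metadata
path is a placeholder; point it at your real dom_md/metadata file, and keep a backup):

import shutil

# Placeholder path; use your real dom_md/metadata path.
META = "/path/to/domain/domain-uuid/dom_md/metadata"

shutil.copy2(META, META + ".bak")  # keep a backup of the original

with open(META) as f:
    lines = f.read().splitlines()

fixed = []
for line in lines:
    if line.startswith("_SHA_CKSUM="):
        continue  # drop the checksum line
    if line.startswith("BLOCK_SIZE="):
        line = "BLOCK_SIZE=4096"
    fixed.append(line)

with open(META, "w") as f:
    f.write("\n".join(fixed) + "\n")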

    I assume it works at this point because ioprocess 1.1 does not
    report the block size to the engine (as it doesn't support this
    option)?

I think it works because ioprocess 1.1 has a bug: it does not use direct I/O when writing
files. This fooled vdsm into believing you have a block size of 512 bytes.

    Can I update the storage domain metadata manually to report 4096
    instead?

    I also noticed that the storage_domain_static table has the
    block_size stored. Should I update this field at the same time as
    I update the metadata file?

Yes, I think it should work.
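
For reference, a sketch of such a database update using Python and psycopg2. The database name,
credentials, and the "id" column for the domain UUID are assumptions here; verify them against your
engine setup and back up the database first:

import psycopg2

DOMAIN_UUID = "6bdf1a0d-274b-4195-8ff5-a5c002ea1a77"

# Assumed connection details; adjust for your engine database.
conn = psycopg2.connect(dbname="engine", user="engine",
                        host="localhost", password="secret")
try:
    with conn:  # commit on success, roll back on error
        with conn.cursor() as cur:
            # block_size is the column you mention; "id" as the domain
            # UUID column is an assumption.
            cur.execute(
                "UPDATE storage_domain_static SET block_size = %s WHERE id = %s",
                (4096, DOMAIN_UUID),
            )
finally:
    conn.close()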

    If the engine log and database dump is still needed to better
    understand the issue, I will send it on Monday.

The engine reports the block size reported by vdsm. Once we get the system up with your 4k storage
domain, we can check that the engine reports the right value and update it if needed.
I think what you should do is:

1. Back up the storage domain metadata in /path/to/domain/domain-uuid/dom_md

2. Deactivate the storage domain (from engine)

3. Edit the metadata file:
- change BLOCK_SIZE to 4096
- delete the checksum line (_SHA_CKSUM=9dde06bbc9f2316efc141565738ff32037b1ff66)

4. Activate the domain

With vdsm < 4.3.6, the domain should become active, since the block size is ignored.

5. Upgrade back to 4.3.6

The system should detect the block size and work normally.

6. File an oVirt bug for this issue

At the very least, we need to document how to fix the storage domain manually.

We should also consider checking storage domain metadata during upgrades. I think it would be a better experience if the upgrade failed and left you with a working system on the older version.
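
To illustrate the kind of pre-upgrade check I mean, a rough sketch (not existing vdsm code) that
compares the recorded BLOCK_SIZE with the block size detected on storage, reusing the probe idea
from earlier:

import re

def check_domain_block_size(metadata_path, detected_block_size):
    # Sketch of a pre-upgrade sanity check: compare the BLOCK_SIZE
    # recorded in the domain metadata with the detected storage block size.
    with open(metadata_path) as f:
        text = f.read()
    m = re.search(r"^BLOCK_SIZE=(\d+)$", text, re.MULTILINE)
    recorded = int(m.group(1)) if m else 512  # assumption: missing key means 512
    if recorded != detected_block_size:
        raise RuntimeError(
            "metadata says BLOCK_SIZE=%d but storage uses %d; fix the "
            "domain metadata before upgrading" % (recorded, detected_block_size))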


I've tried this procedure and it has worked! Thanks!

If you would like me to file a bug, which component should I log it against?

Regards,

Rik


        Should I also downgrade the engine to 4.3.5 to get this to
        work again? I expected the downgrade of the host to be
        sufficient.

        As an alternative, I guess I could enable the emulate512 flag
        on VDO, but I cannot find how to do this on an existing VDO
        volume. Is this possible?


    Please share more data so we can understand the failure:

    - complete vdsm log showing the failure to activate the domain
      - with 4.3.6
      - with 4.3.5 (after you downgraded)
    - contents of
    /rhev/data-center/mnt/_<domaindir>/domain-uuid/dom_md/metadata
      (assuming your local domain mount is /domaindir)
    - engine db dump

    Nir


        Regards,
        Rik


        On 9/26/19 4:58 PM, Sandro Bonazzola wrote:

        The oVirt Project is pleased to announce the general
        availability of oVirt 4.3.6 as of September 26th, 2019.

        This update is the sixth in a series of stabilization
        updates to the 4.3 series.

        This release is available now on x86_64 architecture for:

        * Red Hat Enterprise Linux 7.7 or later (but < 8)

        * CentOS Linux (or similar) 7.7 or later (but < 8)

        This release supports Hypervisor Hosts on x86_64 and ppc64le
        architectures for:

        * Red Hat Enterprise Linux 7.7 or later (but < 8)

        * CentOS Linux (or similar) 7.7 or later (but < 8)

        * oVirt Node 4.3 (available for x86_64 only)

        Since Fedora 28 is now end of life, this release is missing
        the experimental tech preview for the x86_64 and s390x
        architectures on Fedora 28.

        We are working on Fedora 29 and 30 support and we may
        re-introduce experimental support for Fedora in the next
        release.

        See the release notes [1] for installation / upgrade
        instructions and a list of new features and bugs fixed.

        Notes:

        - oVirt Appliance is already available

        - oVirt Node is already available[2]


        oVirt Node and Appliance have been updated including:

        - oVirt 4.3.6: http://www.ovirt.org/release/4.3.6/

        - Wildfly 17.0.1:
          https://wildfly.org/news/2019/07/07/WildFly-1701-Released/

        - Latest CentOS 7.7 updates, including:

          * Release for CentOS Linux 7 (1908) on the x86_64 Architecture
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023405.html>

          * CEBA-2019:2601 CentOS 7 NetworkManager BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023423.html>

          * CEBA-2019:2023 CentOS 7 efivar BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023445.html>

          * CEBA-2019:2614 CentOS 7 firewalld BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023412.html>

          * CEBA-2019:2227 CentOS 7 grubby BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023441.html>

          * CESA-2019:2258 Moderate CentOS 7 http-parser Security Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023439.html>

          * CESA-2019:2600 Important CentOS 7 kernel Security Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023444.html>

          * CEBA-2019:2599 CentOS 7 krb5 BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023420.html>

          * CEBA-2019:2358 CentOS 7 libguestfs BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023421.html>

          * CEBA-2019:2679 CentOS 7 libvirt BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023422.html>

          * CEBA-2019:2501 CentOS 7 rsyslog BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023431.html>

          * CEBA-2019:2355 CentOS 7 selinux-policy BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023432.html>

          * CEBA-2019:2612 CentOS 7 sg3_utils BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023433.html>

          * CEBA-2019:2602 CentOS 7 sos BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023434.html>

          * CEBA-2019:2564 CentOS 7 subscription-manager BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023435.html>

          * CEBA-2019:2356 CentOS 7 systemd BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023436.html>

          * CEBA-2019:2605 CentOS 7 tuned BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023437.html>

          * CEBA-2019:2871 CentOS 7 tzdata BugFix Update
            <https://lists.centos.org/pipermail/centos-announce/2019-September/023450.html>

        - Latest CentOS Virt and Storage SIG updates:

          * Ansible 2.8.5:
            https://github.com/ansible/ansible/blob/stable-2.8/changelogs/CHANGELOG-v2.8.rst#v2-8-5

          * Glusterfs 6.5:
            https://docs.gluster.org/en/latest/release-notes/6.5/

          * QEMU KVM EV 2.12.0-33.1:
            https://cbs.centos.org/koji/buildinfo?buildID=26484


        Given the number of security fixes provided by this release,
        upgrading is recommended as soon as practical.


        Additional Resources:

        * Read more about the oVirt 4.3.6 release highlights:
          http://www.ovirt.org/release/4.3.6/

        * Get more oVirt Project updates on Twitter:
        https://twitter.com/ovirt

        * Check out the latest project news on the oVirt blog:
          http://www.ovirt.org/blog/

        [1] http://www.ovirt.org/release/4.3.6/

        [2] http://resources.ovirt.org/pub/ovirt-4.3/iso/

        --
        Sandro Bonazzola

        MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

        Red Hat EMEA <https://www.redhat.com/>

        sbona...@redhat.com

        Red Hat respects your work life balance. Therefore there is
        no need to answer this email out of your office hours.
        <https://mojo.redhat.com/docs/DOC-1199578>

--
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/R2BHFSOUOUHVI263HA7HB4OURNZK3AWS/
