Re: [linux-lvm] Discussion: performance issue on event activation mode

2021-06-09 Thread Heming Zhao
Either my mailbox or the lvm mailing list is broken; I can't see my last two
mails in the list archive.

In this mail I will mention another issue with lvm2-pvscan@.service.
Both event activation and direct activation have the same problem:
shutdown takes a long time.

The code logic in pvscan_cache_cmd() only takes effect for activation jobs:
```
    if (do_activate &&
        !find_config_tree_bool(cmd, global_event_activation_CFG, NULL)) {
        log_verbose("Ignoring pvscan --cache -aay because event_activation is 
disabled.");
        return ECMD_PROCESSED;
    }
```
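
For reference, the event_activation switch that this check reads lives in the global
section of lvm.conf, and the obtain_device_list_from_udev knob mentioned in the quoted
mail below lives in the devices section. A minimal sketch (values are only illustrative;
distribution defaults may differ):
```
global {
    # 0 = ignore pvscan --cache -aay, i.e. rely on direct activation only
    event_activation = 1
}
devices {
    # 0 = build the device list by scanning /dev instead of asking libudev
    obtain_device_list_from_udev = 1
}
```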

I also have a question about the lvm2-pvscan@.service unit:
why does it run a scan job on stop as well? Could we remove or modify this line?
  ```
  ExecStop=@SBINDIR@/lvm pvscan --cache %i
  ```
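
If the stop-time scan really is unnecessary, one way to experiment without patching the
generated unit would be a systemd drop-in that clears ExecStop (an untested sketch; the
file path is just the usual override location, and whether skipping the scan on stop is
actually safe is exactly the question above):
```
# /etc/systemd/system/lvm2-pvscan@.service.d/no-stop-scan.conf
[Service]
# An empty assignment resets the ExecStop list inherited from the unit file
ExecStop=
```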

Regards
heming

On 6/9/21 12:18 AM, heming.z...@suse.com wrote:
> On 6/8/21 5:30 AM, David Teigland wrote:
>> On Mon, Jun 07, 2021 at 10:27:20AM +, Martin Wilck wrote:
>>> Most importantly, this was about LVM2 scanning of physical volumes. The
>>> number of udev workers has very little influence on PV scanning,
>>> because the udev rules only activate a systemd service. The actual
>>> scanning takes place in lvm2-pvscan@.service. And unlike udev, there's
>>> no limit for the number of instances of a given systemd service
>>> template that can run at any given time.
>>
>> Excessive device scanning has been the historical problem in this area,
>> but Heming mentioned dev_cache_scan() specifically as a problem.  That was
>> surprising to me since it doesn't scan/read devices, it just creates a
>> list of device names on the system (either readdir in /dev or udev
>> listing.)  If there are still problems with excessive scanning/reading,
>> we'll need some more diagnosis of what's happening, there could be some
>> cases we've missed.
>>
> 
> dev_cache_scan() doesn't issue direct disk IOs itself, but libudev will scan/read
> the udev db, which does issue real disk IOs (location: /run/udev/data).
> We can see that the combination "obtain_device_list_from_udev=0 &
> event_activation=1" largely reduces the booting time, from 2min 6s to 40s.
> The key is that dev_cache_scan() then scans the devices by itself (scanning "/dev").
> 
> I am not very familiar with systemd-udev; below is a little more info
> about the libudev path. The top function is _insert_udev_dir, which:
> 1. scans/reads /sys/class/block/. O(n)
> 2. scans/reads the udev db (/run/udev/data). maybe O(n);
>    udev calls device_read_db => handle_db_line to handle every
>    line of a db file.
> 3. does qsort & deduplication on the devices list.  O(n) + O(n)
> 4. performs lots of "memory alloc" & "string copy" actions while working.
>    This takes too much memory; on the host, 'top' shows:
>    - direct activation used only 2G of memory during boot
>    - event activation cost ~20G of memory.
> 
> I didn't test the related udev code, but my guess is that <2> takes too much time.
> There are thousands of scanning jobs reading /run/udev/data in parallel, while at
> the same time many devices need to generate their udev db files in the same dir.
> I am not sure the filesystem can handle this scenario well.
> The other code path, obtain_device_list_from_udev=0, triggers a scan/read of
> "/dev" instead, and that dir sees far fewer write IOs than /run/udev/data.
> 
> Regards
> heming
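
To make the contrast in the quoted mail concrete: with obtain_device_list_from_udev=0
the device list comes from walking /dev directly rather than from the udev db. A
simplified C sketch of such a readdir()-based walk (illustrative only; lvm2's real
dev_cache_scan() also recurses into subdirectories and filters by device type):
```
/* List /dev entries with readdir(); only directory entries are read,
 * so nothing under /run/udev/data is ever opened on this path. */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    DIR *d = opendir("/dev");
    struct dirent *de;

    if (!d) {
        perror("opendir /dev");
        return 1;
    }

    while ((de = readdir(d))) {
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        printf("/dev/%s\n", de->d_name);
    }

    closedir(d);
    return 0;
}
```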






Re: [linux-lvm] Discussion: performance issue on event activation mode

2021-06-09 Thread heming.z...@suse.com



I made a minor mistake in my previous mail: the qsort in step <3> is O(n log n), not O(n).

More info about my analysis:
I set a filter in lvm.conf with the rule: filter = [ "a|/dev/vda2|", "r|.*|" ]
The booting time was reduced a little, from 2min 6s to 1min 42s.

The VM's disk layout:
# lsblk | egrep -A 4 "^vd"
vda 253:0 0  40G  0 disk
├─vda1  253:1 0   8M  0 part
└─vda2  253:2 0  40G  0 part
  ├─system-swap 254:0 0   2G  0 lvm  [SWAP]
  └─system-root 254:1 0  35G  0 lvm  /

The filter rule rejects every device except /dev/vda2, which holds the rootfs LVs.

The rule makes _pvscan_cache_args() remove devices from devl->list via the nodata
filters.
The hot spot narrows down to setup_devices (which calls dev_cache_scan()).
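
For completeness, a sketch of the lvm.conf devices section used for this test
(illustrative; the accepted device has to match the PV that backs the root VG,
/dev/vda2 here):
```
devices {
    # Accept only the PV of the root VG, reject everything else
    filter = [ "a|/dev/vda2|", "r|.*|" ]
}
```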


