Re: zfs deadlock on r360452 relating to busy vm page

2020-05-14 Thread Mark Johnston
On Thu, May 14, 2020 at 02:16:15PM +0300, Andriy Gapon wrote:
> On 13/05/2020 17:42, Mark Johnston wrote:
> > On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote:
> >> On 13/05/2020 10:35, Andriy Gapon wrote:
> >>> In r329363 I re-worked zfs_getpages and introduced range locking to it.
> >>> At the time I believed that it was safe and maybe it was, please see
> >>> the commit message.
> >>> There, indeed, have been many performance / concurrency improvements
> >>> to the VM system and r358443 is one of them.
> >>
> >> Thinking more about it, it could be r352176.
> >> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked)
> >> are not equivalent to the code that they replaced.
> >> The original code would check the valid field before any locking, and it
> >> would attempt locking / busying only if the page was invalid.  The object
> >> was required to be locked, though.
> >>
> >> The new code tries to busy the page in any case.
> >>
> >>> I am not sure how to resolve the problem best.  Maybe someone who
> >>> knows the latest VM code better than me can comment on my assumptions
> >>> stated in the commit message.
> > 
> > The general trend has been to use the page busy lock as the single point
> > of synchronization for per-page state.  As you noted, updates to the
> > valid bits were previously interlocked by the object lock, but this is
> > coarse-grained and hurts concurrency.  I think you are right that the
> > range locking in getpages was ok before the recent change, but it seems
> > preferable to try and address this in ZFS.
> > 
> >>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range
> >>> locking in this corner of the code because of a similar deadlock a long
> >>> time ago.
> > 
> > Do they just not implement readahead?
> 
> I think so, but not 100% sure.
> I recall seeing a comment in illumos code that they do not care about
> read-ahead because there is ZFS prefetch and the data will be cached in
> ARC.  That makes sense from the I/O point of view, but it does not help
> with page faults.
> 
> > Can you explain exactly what the
> > range lock accomplishes here - is it entirely to ensure that znode block
> > size remains stable?
> 
> As far as I can recall, this is the reason indeed.

It seems to me that zfs_getpages() could use a non-blocking
rangelock_enter() operation to avoid the deadlock.  The ZFS rangelock
implementation doesn't have one, but it is easy to add.  I'm not able to
trigger the deadlock with this patch: https://reviews.freebsd.org/D24839
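
To make the idea concrete, here is a rough sketch; the names
(rangelock_tryenter(), the locals in zfs_getpages(), and the fallback
behaviour) are illustrative assumptions, not a quote of the actual patch:

	/*
	 * Sketch only: a non-blocking variant of the ZFS rangelock entry
	 * point, assumed to take the same arguments as rangelock_enter()
	 * but to return NULL instead of sleeping when the range is held.
	 */
	locked_range_t *rangelock_tryenter(rangelock_t *rl, uint64_t off,
	    uint64_t len, rangelock_type_t type);

	/* Assumed use in zfs_getpages(): give up read-ahead rather than sleep. */
	lr = rangelock_tryenter(&zp->z_rangelock, rounddown(off, blksz),
	    roundup(end, blksz) - rounddown(off, blksz), RL_READER);
	if (lr == NULL) {
		/*
		 * We hold busied pages, so sleeping on the range lock could
		 * deadlock against a thread that holds the range lock and
		 * waits on the page busy state.  Fall back to filling only
		 * the requested pages, without read-ahead/read-behind.
		 */
	}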


Re: PCIe NVME drives not detected on Dell R6515

2020-05-14 Thread Chuck Tuffli
On Mon, May 4, 2020 at 11:12 AM Miroslav Lachman <000.f...@quip.cz> wrote:
>
> On 2020-04-27 08:02, Miroslav Lachman wrote:
> > I don't know what is going on with Scott. I hope he is well.
> > Is there somebody else who can help me with this issue?
> > Scott wrote that there are hot-plug PCIe buses not probed during the boot
> > process.
> > I am not a developer, so I cannot move forward alone.
>
> The problem is with PCIe Hot Plug.
> The Hot Plug bus was not enumerated, thus no NVMe devices were detected.

I may have just been bitten by this as well when running FreeBSD under
qemu. The q35 machine type with PCIe emulation enables PCIe hot plug
on all the root ports, but I am not seeing any downstream devices
(either emulated like e1000 or passed through by the host) because of
a check in pcib_hotplug_present():
	/*
	 * Require the Electromechanical Interlock to be engaged if
	 * present.
	 */
	if (sc->pcie_slot_cap & PCIEM_SLOT_CAP_EIP &&
	    (sc->pcie_slot_sta & PCIEM_SLOT_STA_EIS) == 0)
		return (0);

Under qemu, the slot indicates that an Electromechanical Interlock is
Present in the capabilities register, but it does not set the
Electromechanical Interlock Status bit. This causes the PCI driver not
to probe any children. Commenting out the above code made both the
emulated PCIe devices and the passed-through host devices appear in
FreeBSD. As a data point, I'm not seeing a similar check in the Linux
kernel.

Miroslav, would it be possible to comment out/delete the above code in
your kernel and retest to see if that helps your case as well?
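
For concreteness, the kind of change meant above is just disabling that
block for the experiment, along these lines (a quick test hack, not a
proposed fix):

#if 0	/* test hack: qemu advertises EIP but never sets EIS */
	/*
	 * Require the Electromechanical Interlock to be engaged if
	 * present.
	 */
	if (sc->pcie_slot_cap & PCIEM_SLOT_CAP_EIP &&
	    (sc->pcie_slot_sta & PCIEM_SLOT_STA_EIS) == 0)
		return (0);
#endif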

--chuck


Re: zfs deadlock on r360452 relating to busy vm page

2020-05-14 Thread Andriy Gapon
On 13/05/2020 17:42, Mark Johnston wrote:
> On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote:
>> On 13/05/2020 10:35, Andriy Gapon wrote:
>>> In r329363 I re-worked zfs_getpages and introduced range locking to it.
>>> At the time I believed that it was safe and maybe it was, please see
>>> the commit message.
>>> There, indeed, have been many performance / concurrency improvements to
>>> the VM system and r358443 is one of them.
>>
>> Thinking more about it, it could be r352176.
>> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked)
>> are not equivalent to the code that they replaced.
>> The original code would check the valid field before any locking, and it
>> would attempt locking / busying only if the page was invalid.  The object
>> was required to be locked, though.
>>
>> The new code tries to busy the page in any case.
>>
>>> I am not sure how to resolve the problem best.  Maybe someone who knows
>>> the latest VM code better than me can comment on my assumptions stated
>>> in the commit message.
> 
> The general trend has been to use the page busy lock as the single point
> of synchronization for per-page state.  As you noted, updates to the
> valid bits were previously interlocked by the object lock, but this is
> coarse-grained and hurts concurrency.  I think you are right that the
> range locking in getpages was ok before the recent change, but it seems
> preferable to try and address this in ZFS.
> 
>>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range
>>> locking in this corner of the code because of a similar deadlock a long
>>> time ago.
> 
> Do they just not implement readahead?

I think so, but not 100% sure.
I recall seeing a comment in illumos code that they do not care about read-ahead
because there is ZFS prefetch and the data will be cached in ARC.  That makes
sense from the I/O point of view, but it does not help with page faults.

> Can you explain exactly what the
> range lock accomplishes here - is it entirely to ensure that znode block
> size remains stable?

As far as I can recall, this is the reason indeed.
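
For context, a rough sketch of why that matters in zfs_getpages(); the
field and function names here (zp->z_blksz, rangelock_enter(), etc.) are
from memory and only illustrative:

	/*
	 * The block size determines how the read-ahead window is rounded,
	 * so it must not change while the busied pages are being filled.
	 * Holding the range lock over the block-aligned window keeps a
	 * concurrent write from growing the block size under us.
	 */
	blksz = zp->z_blksz;
	lr = rangelock_enter(&zp->z_rangelock, rounddown(off, blksz),
	    roundup(end, blksz) - rounddown(off, blksz), RL_READER);
	/* ... fill the requested pages plus read-ahead via the DMU ... */
	rangelock_exit(lr);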


-- 
Andriy Gapon