On 02/07/2019 09:02 PM, Robin Murphy wrote:
> On 07/02/2019 13:36, Oscar Salvador wrote:
>> On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote:
>>> ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug,
>>> but being able to exercise the (arguably trickier) hot-remove path would
>>> be even more useful. Extend the feature to allow removal of offline
>>> sections to be triggered manually to aid development.
>>>
>>> Signed-off-by: Robin Murphy <robin.mur...@arm.com>
>>> ---
>>>
>>> This is inspired by a previous proposal[1], but in coming up with a
>>> more robust interface I ended up rewriting the whole thing from
>>> scratch. The lack of documentation is semi-deliberate, since I don't
>>> like the idea of anyone actually relying on this interface as ABI, but
>>> as a handy tool it felt useful enough to be worth sharing :)
>>
>> Hi Robin,
>>
>> I think this might come in handy, especially when trying to test hot-remove
>> on arch's that do not have any means to hot-remove memory, or even on virtual
>> platforms that do not have yet support for hot-remove depending on the 
>> platform,
>> like qemu/arm64.
>>
>>
>> I could have used this while testing hot-remove on other archs for [1]
>>
>>>
>>> Robin.
>>>
>>> [1] 
>>> https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git...@linux.vnet.ibm.com/
>>>
>>
>>> +    if (mem->state != MEM_OFFLINE)
>>> +        return -EBUSY;
>>
>> We do have the helper "is_memblock_offlined()", although it is only used in 
>> one place now.
>> So, I would rather use it here as well.
> 
> Ooh, if I'd actually noticed that that helper existed, I would indeed have 
> used it - fixed.
> 
>>> +
>>> +    ret = lock_device_hotplug_sysfs();
>>> +    if (ret)
>>> +        return ret;
>>> +
>>> +    if (device_remove_file_self(dev, attr)) {
>>> +        __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn),
>>> +                MIN_MEMORY_BLOCK_SIZE * sections_per_block);
>>
>> Sorry, I am not into sysfs inners, but I thought that:
>> device_del::device_remove_attrs::device_remove_groups::sysfs_remove_groups
>> would be enough to remove the dev attributes.
>> I guess in this case that is not enough, could you explain why?
> 
> As I found out the hard way, since the "remove" attribute itself belongs to 
> the device being removed, the standard device teardown callchain would end up 
> trying to remove the file from its own method, which results in deadlock. 
> Fortunately, the PCI sysfs code has a similar "remove" attribute which showed 
> me how it should be handled - following the kerneldoc breadcrumb trail to 
> kernfs_remove_self() hopefully explains it more completely.
Instead we could have an interface like 
/sys/devices/system/memory/[unprobe|remove]
which would take a memory block start address looking into the existing ones and
attempt to remove [addr, addr + memory_block_size] provided its already 
offlined.
This will be exact opposite for /sys/devices/system/memory/probe except the fact
that it can trigger onlining of the memory automatically (even this new one 
could
trigger offlining automatically as well). But I dont have a preference between 
the
proposed one or this one. Either of them should be okay.

Reply via email to