On Fri, Nov 30, 2012 at 03:08:02AM +0000, Liu, Jinsong wrote: > Konrad Rzeszutek Wilk wrote: > > On Wed, Nov 21, 2012 at 11:45:04AM +0000, Liu, Jinsong wrote: > >>> From 630c65690c878255ce71e7c1172338ed08709273 Mon Sep 17 00:00:00 > >>> 2001 > >> From: Liu Jinsong <jinsong....@intel.com> > >> Date: Tue, 20 Nov 2012 21:14:37 +0800 > >> Subject: [PATCH 1/2] Xen acpi memory hotplug driver > >> > >> Xen acpi memory hotplug consists of 2 logic components: > >> Xen acpi memory hotplug driver and Xen hypercall. > >> > >> This patch implement Xen acpi memory hotplug driver. When running > >> under xen platform, Xen driver will early occupy (so native driver > > > > How will it 'early occupy'? Can you spell it out here please? > > Sure, will add it like > 'When running under xen platform, at booting stage xen memory hotplug driver > will early occupy via subsys_initcall (earlier than native module_init), so > xen driver will take effect and native driver will be blocked'.
OK. > > > > >> will be blocked). When acpi memory notify OSPM, xen driver will take > >> effect, adding related memory device and parsing memory information. > >> > >> Signed-off-by: Liu Jinsong <jinsong....@intel.com> > >> --- > >> drivers/xen/Kconfig | 11 + > >> drivers/xen/Makefile | 1 + > >> drivers/xen/xen-acpi-memhotplug.c | 383 > >> +++++++++++++++++++++++++++++++++++++ 3 files changed, 395 > >> insertions(+), 0 deletions(-) create mode 100644 > >> drivers/xen/xen-acpi-memhotplug.c > >> > >> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig > >> index 126d8ce..abd0396 100644 > >> --- a/drivers/xen/Kconfig > >> +++ b/drivers/xen/Kconfig > >> @@ -206,4 +206,15 @@ config XEN_MCE_LOG > >> Allow kernel fetching MCE error from Xen platform and > >> converting it into Linux mcelog format for mcelog tools > >> > >> +config XEN_ACPI_MEMORY_HOTPLUG > >> + bool "Xen ACPI memory hotplug" > > > > There should be a way to make this a module. > > I have some concerns to make it a module: > 1. xen and native memhotplug driver both work as module, while we need early > load xen driver. > 2. if possible, a xen stub driver may solve load sequence issue, but it may > involve other issues > * if xen driver load then unload, native driver may have chance to load > successfully; The stub driver would still "occupy" the ACPI bus for the memory hotplug PnP, so I think this would not be a problem. > * if xen driver load --> unload --> load again, then it will lose hotplug > notification during unload period; Sure. But I think we can do it with this driver? After all the function of it is to just tell the firmware to turn on/off sockets - and if we miss one notification we won't take advantage of the power savings - but we can do that later on. > * if xen driver load --> unload --> load again, then it will re-add all > memory devices, but the handle for 'booting memory device' and 'hotplug > memory device' are different while we have no way to distinguish these 2 kind > of devices. Wouldn't the stub driver hold onto that? > > IMHO I think to make xen hotplug logic as module may involves unexpected > result. Is there any obvious advantages of doing so? after all we have > provided config choice to user. Thoughts? Yes, it becomes a module - which is what we want. > > > > > > >> + depends on XEN_DOM0 && X86_64 && ACPI > >> + default n > >> + help > >> + This is Xen acpi memory hotplug. > > ^^^^ -> ACPI > > > >> + > >> + Currently Xen only support acpi memory hot-add. If you want > > ^^^^-> ACPI > > > >> + to hot-add memory at runtime (the hot-added memory cannot be > >> + removed until machine stop), select Y here, otherwise select N. + > >> endmenu > >> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile > >> index 7435470..c339eb4 100644 > >> --- a/drivers/xen/Makefile > >> +++ b/drivers/xen/Makefile > >> @@ -30,6 +30,7 @@ obj-$(CONFIG_XEN_MCE_LOG) += mcelog.o > >> obj-$(CONFIG_XEN_PCIDEV_BACKEND) += xen-pciback/ > >> obj-$(CONFIG_XEN_PRIVCMD) += xen-privcmd.o > >> obj-$(CONFIG_XEN_ACPI_PROCESSOR) += xen-acpi-processor.o > >> +obj-$(CONFIG_XEN_ACPI_MEMORY_HOTPLUG) += xen-acpi-memhotplug.o > >> xen-evtchn-y := evtchn.o xen-gntdev-y > >> := gntdev.o > >> xen-gntalloc-y := gntalloc.o > >> diff --git a/drivers/xen/xen-acpi-memhotplug.c > >> b/drivers/xen/xen-acpi-memhotplug.c new file mode 100644 index > >> 0000000..f0c7990 --- /dev/null > >> +++ b/drivers/xen/xen-acpi-memhotplug.c > >> @@ -0,0 +1,383 @@ > >> +/* > >> + * Copyright (C) 2012 Intel Corporation > >> + * Author: Liu Jinsong <jinsong....@intel.com> > >> + * Author: Jiang Yunhong <yunhong.ji...@intel.com> + * > >> + * This program is free software; you can redistribute it and/or > >> modify + * it under the terms of the GNU General Public License as > >> published by + * the Free Software Foundation; either version 2 of > >> the License, or (at + * your option) any later version. > >> + * > >> + * This program is distributed in the hope that it will be useful, > >> but + * WITHOUT ANY WARRANTY; without even the implied warranty of > >> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE > >> or + * NON INFRINGEMENT. See the GNU General Public License for > >> more + * details. + */ > >> + > >> +#include <linux/kernel.h> > >> +#include <linux/init.h> > >> +#include <linux/types.h> > >> +#include <acpi/acpi_drivers.h> > >> + > >> +#define ACPI_MEMORY_DEVICE_CLASS "memory" > >> +#define ACPI_MEMORY_DEVICE_HID "PNP0C80" > >> +#define ACPI_MEMORY_DEVICE_NAME "Hotplug Mem Device" > > > > Weird tabs? > > > > It ported from native and seems right tabs? will double check. > > >> + > >> +#undef PREFIX > > > > Why the #undef ? > >> +#define PREFIX "ACPI:memory_hp:" > > > > > > Not "ACPI:memory_xen:" ? > > OK, how about more detailed "ACPI:xen_memory_hotplug:"? Sure. > > > > > > >> + > >> +static int acpi_memory_device_add(struct acpi_device *device); > >> +static int acpi_memory_device_remove(struct acpi_device *device, > >> int type); + +static const struct acpi_device_id memory_device_ids[] > >> = { + {ACPI_MEMORY_DEVICE_HID, 0}, > >> + {"", 0}, > >> +}; > >> + > >> +static struct acpi_driver acpi_memory_device_driver = { > >> + .name = "acpi_memhotplug", > > > > Not 'xen_acpi_memhotplug' ? > > No, here driver name (same as native driver name) used to block native driver > loading. Then you need a comment in the file explaining that. > > > > >> + .class = ACPI_MEMORY_DEVICE_CLASS, > >> + .ids = memory_device_ids, > >> + .ops = { > >> + .add = acpi_memory_device_add, > >> + .remove = acpi_memory_device_remove, > > > > Just for sake of clarity I would prefix those with 'xen_'. > > OK. > > > > >> + }, > >> +}; > >> + > >> +struct acpi_memory_info { > >> + struct list_head list; > >> + u64 start_addr; /* Memory Range start physical addr */ > >> + u64 length; /* Memory Range length */ > >> + unsigned short caching; /* memory cache attribute */ > >> + unsigned short write_protect; /* memory read/write attribute */ > > > > Can't the write_protect by a bit field like the 'enabled'? So > > unsigned int write_protect:1; > > ? > > Seems not good, write_protect copied from an acpi buffer (byte3) getting from > _CRS evaluation. Ah, pls put a comment in there as well why that cannot be done. > > >> + unsigned int enabled:1; > >> +}; > >> + > >> +struct acpi_memory_device { > >> + struct acpi_device *device; > >> + struct list_head res_list; > >> +}; > >> + > >> +static int acpi_hotmem_initialized; > > > > Just make it a bool and also use __read_mostly please. > > OK. > > > > >> + > >> + > >> +int xen_acpi_memory_enable_device(struct acpi_memory_device > >> *mem_device) +{ + return 0; > >> +} > > > > Why even have this function if it does not do anything? > > Not a nop, it implemented at patch 2/2. Yup, saw it in the next patch. > > > > >> + > >> +static acpi_status > >> +acpi_memory_get_resource(struct acpi_resource *resource, void > >> *context) +{ + struct acpi_memory_device *mem_device = context; > >> + struct acpi_resource_address64 address64; > >> + struct acpi_memory_info *info, *new; > >> + acpi_status status; > >> + > >> + status = acpi_resource_to_address64(resource, &address64); + if > >> (ACPI_FAILURE(status) || + (address64.resource_type != > >> ACPI_MEMORY_RANGE)) + return AE_OK; + > >> + list_for_each_entry(info, &mem_device->res_list, list) { > >> + /* Can we combine the resource range information? */ > > > > I don't know? Is this is a future TODO? > > I'm also not quite sure, this comments ported from native side. OK, pls find out. Perhaps this comment is stale. > > > > >> + if ((info->caching == address64.info.mem.caching) && > >> + (info->write_protect == address64.info.mem.write_protect) && > >> + (info->start_addr + info->length == address64.minimum)) { > >> + info->length += address64.address_length; > >> + return AE_OK; > >> + } > >> + } > >> + > >> + new = kzalloc(sizeof(struct acpi_memory_info), GFP_KERNEL); + if > >> (!new) + return AE_ERROR; > >> + > >> + INIT_LIST_HEAD(&new->list); > >> + new->caching = address64.info.mem.caching; > >> + new->write_protect = address64.info.mem.write_protect; > >> + new->start_addr = address64.minimum; > >> + new->length = address64.address_length; > >> + list_add_tail(&new->list, &mem_device->res_list); + > >> + return AE_OK; > >> +} > >> + > >> +static int > >> +acpi_memory_get_device_resources(struct acpi_memory_device > >> *mem_device) +{ + acpi_status status; > >> + struct acpi_memory_info *info, *n; > >> + > >> + if (!list_empty(&mem_device->res_list)) > >> + return 0; > >> + > >> + status = acpi_walk_resources(mem_device->device->handle, > >> + METHOD_NAME__CRS, acpi_memory_get_resource, mem_device); + > >> + if (ACPI_FAILURE(status)) { > >> + list_for_each_entry_safe(info, n, &mem_device->res_list, list) > >> + kfree(info); + > >> INIT_LIST_HEAD(&mem_device->res_list); > >> + return -EINVAL; > >> + } > >> + > >> + return 0; > >> +} > >> + > >> +static int > >> +acpi_memory_get_device(acpi_handle handle, > >> + struct acpi_memory_device **mem_device) > >> +{ > >> + acpi_status status; > >> + acpi_handle phandle; > >> + struct acpi_device *device = NULL; > >> + struct acpi_device *pdevice = NULL; > >> + int result; > >> + > >> + if (!acpi_bus_get_device(handle, &device) && device) + goto > >> end; > >> + > >> + status = acpi_get_parent(handle, &phandle); > >> + if (ACPI_FAILURE(status)) { > >> + pr_warn(PREFIX "Cannot find acpi parent\n"); > >> + return -EINVAL; > >> + } > >> + > >> + /* Get the parent device */ > >> + result = acpi_bus_get_device(phandle, &pdevice); > >> + if (result) { > >> + pr_warn(PREFIX "Cannot get acpi bus device\n"); > >> + return -EINVAL; > >> + } > >> + > >> + /* > >> + * Now add the notified device. This creates the acpi_device > >> + * and invokes .add function > >> + */ > >> + result = acpi_bus_add(&device, pdevice, handle, > >> ACPI_BUS_TYPE_DEVICE); + if (result) { + pr_warn(PREFIX "Cannot > >> add > >> acpi bus\n"); + return -EINVAL; > >> + } > >> + > >> +end: > >> + *mem_device = acpi_driver_data(device); > >> + if (!(*mem_device)) { > >> + pr_err(PREFIX "Driver data not found\n"); > >> + return -ENODEV; > >> + } > >> + > >> + return 0; > >> +} > >> + > >> +static int acpi_memory_check_device(struct acpi_memory_device > >> *mem_device) +{ + unsigned long long current_status; > >> + > >> + /* Get device present/absent information from the _STA */ > >> + if (ACPI_FAILURE(acpi_evaluate_integer(mem_device->device->handle, > >> + "_STA", NULL, ¤t_status))) > >> + return -ENODEV; > >> + /* > >> + * Check for device status. Device should be > >> + * present/enabled/functioning. > >> + */ > >> + if (!((current_status & ACPI_STA_DEVICE_PRESENT) > >> + && (current_status & ACPI_STA_DEVICE_ENABLED) > >> + && (current_status & ACPI_STA_DEVICE_FUNCTIONING))) > >> + return -ENODEV; + > >> + return 0; > >> +} > >> + > >> +static int acpi_memory_disable_device(struct acpi_memory_device > >> *mem_device) +{ + pr_warn(PREFIX "Xen does not support memory > >> hotremove\n"); + + return -ENOSYS; > >> +} > >> + > >> +static void acpi_memory_device_notify(acpi_handle handle, u32 > >> event, void *data) +{ + struct acpi_memory_device *mem_device; > >> + struct acpi_device *device; > >> + u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */ + > >> + switch (event) { > >> + case ACPI_NOTIFY_BUS_CHECK: > >> + ACPI_DEBUG_PRINT((ACPI_DB_INFO, > >> + "\nReceived BUS CHECK notification for device\n")); + > >> /* Fall > >> Through */ + case ACPI_NOTIFY_DEVICE_CHECK: > >> + if (event == ACPI_NOTIFY_DEVICE_CHECK) > >> + ACPI_DEBUG_PRINT((ACPI_DB_INFO, > >> + "\nReceived DEVICE CHECK notification for device\n")); + > >> + if (acpi_memory_get_device(handle, &mem_device)) { > >> + pr_err(PREFIX "Cannot find driver data\n"); > >> + break; > >> + } > >> + > >> + ost_code = ACPI_OST_SC_SUCCESS; > >> + break; > >> + > >> + case ACPI_NOTIFY_EJECT_REQUEST: > >> + ACPI_DEBUG_PRINT((ACPI_DB_INFO, > >> + "\nReceived EJECT REQUEST notification for device\n")); > >> + > >> + if (acpi_bus_get_device(handle, &device)) { > >> + pr_err(PREFIX "Device doesn't exist\n"); > >> + break; > >> + } > >> + mem_device = acpi_driver_data(device); > >> + if (!mem_device) { > >> + pr_err(PREFIX "Driver Data is NULL\n"); > >> + break; > >> + } > >> + > >> + /* > >> + * TBD: implement acpi_memory_disable_device and invoke > >> + * acpi_bus_remove if Xen support hotremove in the future + > >> */ > >> + acpi_memory_disable_device(mem_device); > >> + break; > >> + > >> + default: > >> + ACPI_DEBUG_PRINT((ACPI_DB_INFO, > >> + "Unsupported event [0x%x]\n", event)); > >> + /* non-hotplug event; possibly handled by other handler */ > >> + return; + } > >> + > >> + /* Inform firmware that the hotplug operation has completed */ > >> + (void) acpi_evaluate_hotplug_ost(handle, event, ost_code, NULL); > > > > > > Hm, even if we failed? Say for the ACPI_NOTIFY_EJECT_REQUEST ? > > OK, let's remove this the comments 'Inform firmware that the hotplug > operation has completed' > For ACPI_NOTIFY_EJECT_REQUEST, it in fact inform firmware > 'ACPI_OST_SC_NON_SPECIFIC_FAILURE'. > > > > >> + return; > >> +} > >> + > >> +static int acpi_memory_device_add(struct acpi_device *device) +{ > >> + int result; > >> + struct acpi_memory_device *mem_device = NULL; > >> + > >> + > >> + if (!device) > >> + return -EINVAL; > >> + > >> + mem_device = kzalloc(sizeof(struct acpi_memory_device), > >> GFP_KERNEL); + if (!mem_device) + return -ENOMEM; > >> + > >> + INIT_LIST_HEAD(&mem_device->res_list); > >> + mem_device->device = device; > >> + sprintf(acpi_device_name(device), "%s", ACPI_MEMORY_DEVICE_NAME); > >> + sprintf(acpi_device_class(device), "%s", ACPI_MEMORY_DEVICE_CLASS); > >> + device->driver_data = mem_device; > >> + > >> + /* Get the range from the _CRS */ > >> + result = acpi_memory_get_device_resources(mem_device); + if > >> (result) { + kfree(mem_device); > >> + return result; > >> + } > >> + > >> + /* > >> + * Early boot code has recognized memory area by EFI/E820. > >> + * If DSDT shows these memory devices on boot, hotplug is not > >> necessary + * for them. So, it just returns until completion of > >> this driver's + * start up. + */ > >> + if (!acpi_hotmem_initialized) > >> + return 0; > >> + > >> + if (!acpi_memory_check_device(mem_device)) > >> + result = xen_acpi_memory_enable_device(mem_device); > > > > This is a nop. Could you just do: > > result = 0; > > ? > > It implemented at patch 2/2. > > Thanks, > Jinsong > > > > >> + > >> + return result; > >> +} > >> + > >> +static int acpi_memory_device_remove(struct acpi_device *device, > >> int type) +{ + struct acpi_memory_device *mem_device = NULL; > >> + > >> + if (!device || !acpi_driver_data(device)) > >> + return -EINVAL; > >> + > >> + mem_device = acpi_driver_data(device); > >> + kfree(mem_device); > >> + > >> + return 0; > >> +} > >> + > >> +/* > >> + * Helper function to check for memory device > >> + */ > >> +static acpi_status is_memory_device(acpi_handle handle) +{ > >> + char *hardware_id; > >> + acpi_status status; > >> + struct acpi_device_info *info; > >> + > >> + status = acpi_get_object_info(handle, &info); > >> + if (ACPI_FAILURE(status)) > >> + return status; > >> + > >> + if (!(info->valid & ACPI_VALID_HID)) { > >> + kfree(info); > >> + return AE_ERROR; > >> + } > >> + > >> + hardware_id = info->hardware_id.string; > >> + if ((hardware_id == NULL) || > >> + (strcmp(hardware_id, ACPI_MEMORY_DEVICE_HID))) + status = > >> AE_ERROR; + > >> + kfree(info); > >> + return status; > >> +} > >> + > >> +static acpi_status > >> +acpi_memory_register_notify_handler(acpi_handle handle, > >> + u32 level, void *ctxt, void **retv) > >> +{ > >> + acpi_status status; > >> + > >> + status = is_memory_device(handle); > >> + if (ACPI_FAILURE(status)) > >> + return AE_OK; /* continue */ > >> + > >> + status = acpi_install_notify_handler(handle, ACPI_SYSTEM_NOTIFY, > >> + acpi_memory_device_notify, NULL); > >> + /* continue */ > >> + return AE_OK; > >> +} > >> + > >> +static int __init xen_acpi_memory_device_init(void) +{ > >> + int result; > >> + acpi_status status; > >> + > >> + /* only dom0 is responsible for xen acpi memory hotplug */ + if > >> (!xen_initial_domain()) + return -ENODEV; > >> + > >> + result = acpi_bus_register_driver(&acpi_memory_device_driver); > >> + if (result < 0) + return -ENODEV; > >> + > >> + status = acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT, > >> + ACPI_UINT32_MAX, + > >> > >> acpi_memory_register_notify_handler, NULL, + > >> NULL, NULL); + > >> + if (ACPI_FAILURE(status)) { > >> + pr_warn(PREFIX "walk_namespace failed\n"); > >> + acpi_bus_unregister_driver(&acpi_memory_device_driver); + > >> return > >> -ENODEV; + } > >> + > >> + acpi_hotmem_initialized = 1; > >> + return 0; > >> +} > >> +subsys_initcall(xen_acpi_memory_device_init); > >> -- > >> 1.7.1 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/