Latest kernel doesn't boot
The latest Linux kernel doesn't boot on my computer (h=21511abd0a248a3f225d3b611cfabb93124605a7). elilo hangs while booting this kernel. 2.6.24 works.
Re: Latest kernel doesn't boot
Zitat von "H. Peter Anvin" <[EMAIL PROTECTED]>: > [EMAIL PROTECTED] wrote: > > The latest linux kernel doesn't boot on my computer > > (h=21511abd0a248a3f225d3b611cfabb93124605a7). > > > > elilo hangs while booting this kernel. 2.6.24 works. > > Wow, so we know it's affected with EFI, since you're using elilo. > > You gave absolutely zero other information about your system or what > "doesn't boot" mean. It's a macbook pro with a core duo processor (the one with only 32bits). Doesn't boot means: i enter the name of the kernel image i want to boot in elilo and press enter and nothing happens, it just hangs. i need to press the power button for a few seconds to power off. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
question regarding user mode linux
I'm trying to set up a User Mode Linux session and dmesg says:

[cut]
ubda: unknown partition table
Choosing a random ethernet address for device eth0
Netdevice 0 (9e:69:a0:f3:f1:f0) : TUN/TAP backend - IP = 192.168.5.1
Filesystem "ubda": Disabling barriers, not supported by the underlying device
[cut]

i.e. eth0. But the network device is:

# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:      0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth6:      0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0

i.e. eth6.

# uname -a
Linux localhost 2.6.24-rc7-gd0c4c9d4 #1 Sat Jan 12 13:25:44 CET 2008 i686 UML User Mode Linux GNU/Linux

Any ideas what could be wrong here?

Kind regards,
Thomas
[oops] ipppd/isdn
t symbol <=
>>EIP; c010e33f <=
Trace; c011415c <__run_task_queue+50/64>
Trace; c012b47a <__wait_on_buffer+6a/8c>
Trace; c012b61e
Trace; c012b73a
Trace; c0192d1c
Trace; c0192d7a
Trace; c011018e
Trace; c0112c6b
Trace; c010d658
Trace; c01070be
Trace; c010d9a7
Trace; c010d658
Trace; c010f389
Trace; c010fd59
Trace; c01057c0
Trace; c0106c58
Code; c010e33f <_EIP>:
Code; c010e33f <=                  0: 0f 0b       ud2a   <=
Code; c010e341                     2: 8d 65 dc    lea    0xffdc(%ebp),%esp
Code; c010e344                     5: 5b          pop    %ebx
Code; c010e345                     6: 5e          pop    %esi
Code; c010e346                     7: 5f          pop    %edi
Code; c010e347                     8: 89 ec       mov    %ebp,%esp
Code; c010e349                     a: 5d          pop    %ebp
Code; c010e34a                     b: c3          ret
Code; c010e34b                     c: 90          nop
Code; c010e34c <__wake_up+0/90>    d: 55          push   %ebp
Code; c010e34d <__wake_up+1/90>    e: 89 e5       mov    %esp,%ebp
Code; c010e34f <__wake_up+3/90>   10: 83 ec 0c    sub    $0xc,%esp
Code; c010e352 <__wake_up+6/90>   13: 57          push   %edi

<0>Kernel panic: aiee, killing interrupt handler!

3 warnings issued. Results may not be reliable.

[3] syslog:

Jun 29 23:28:49 knecht ipppd[237]: Modem hangup
Jun 29 23:28:49 knecht ipppd[237]: Connection terminated.
Jun 29 23:28:49 knecht ipppd[237]: taking down PHASE_DEAD link 0, linkunit: 0
Jun 29 23:28:49 knecht ipppd[237]: closing fd 8 from unit 0
Jun 29 23:28:49 knecht ipppd[237]: link 0 closed , linkunit: 0
Jun 29 23:28:49 knecht ipppd[237]: reinit_unit: 0
Jun 29 23:28:49 knecht ipppd[237]: Connect[0]: /dev/ippp0, fd: 8
Jun 29 23:28:49 knecht kernel: ippp_ccp: freeing reset data structure c352c000
Jun 29 23:28:49 knecht kernel: ippp, open, slot: 0, minor: 0, state:
Jun 29 23:28:49 knecht kernel: ippp_ccp: allocated reset data structure c352c000
Jun 29 23:28:49 knecht ipppd[237]: Modem hangup
Jun 29 23:28:49 knecht ipppd[237]: Connection terminated.
Jun 29 23:28:49 knecht ipppd[237]: taking down PHASE_DEAD link 1, linkunit: 1
Jun 29 23:28:49 knecht ipppd[237]: closing fd 9 from unit 1
Jun 29 23:28:49 knecht ipppd[237]: link 1 closed , linkunit: 1
Jun 29 23:28:49 knecht ipppd[237]: reinit_unit: 1
Jun 29 23:28:49 knecht ipppd[237]: Connect[1]: /dev/ippp1, fd: 9
Jun 29 23:28:49 knecht kernel: ippp_ccp: freeing reset data structure c352c800
Jun 29 23:28:49 knecht kernel: ippp, open, slot: 1, minor: 1, state:
Jun 29 23:28:49 knecht kernel: ippp_ccp: allocated reset data structure c352c800
Jun 29 23:30:21 knecht ipppd[237]: Local number: x, Remote number: x,
Jun 29 23:30:21 knecht ipppd[237]: PHASE_WAIT -> PHASE_ESTABLISHED, ifunit: 0, l
Jun 29 23:30:21 knecht ipppd[237]: Remote message:
Jun 29 23:30:21 knecht ipppd[237]: MPPP negotiation, He: Yes We: Yes

thx, thomas
Re: pdflush stuck in D state with v2.6.24-rc1-192-gef49c32
Florin Iucha iucha.net> writes:
> > It's really curious - I tried your .config and commands, and still
> > could not trigger the high iowait. I'm running 64bit Intel Core 2,
> > and kernel 2.6.24-rc1-git6 with the above patch.
>
> Curious but 100% reproducible, at least on my box. What I'm going to
> try is booting into the kernel with your patch and just doing the find
> / md5sum. It would be really interesting if the read-only access
> triggers it.
>
> florin

I can confirm this issue on any .24-rc too. I'm also using reiserfs on LVM. And there is one more user on the Gentoo forums having the same issue: http://forums.gentoo.org/viewtopic-t-612959.html

So you are not alone, Florin.
New patch: drm-populated memory types
This one incorporates some of Arjan's suggestions and a fix for the i810 problem introduced with the previous patch.

/Thomas
[PATCH] agpgart: Allow drm-populated agp memory types
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 130 ++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 186 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 19 files changed, 296 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..f821243 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_alloc_page_array(size_t size, struct agp_memory *mem); +void agp_free_page_array(struct agp_memory *mem); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct agp_bridge_driver ali_m154 .free_by_type = 
agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_bridge_driver alpha_core_agp_ .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .
[PATCH] agpgart: Allow drm-populated agp memory types update2
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 130 ++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 186 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/parisc-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sis-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 21 files changed, 298 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..f821243 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_alloc_page_array(size_t size, struct agp_memory *mem); +void agp_free_page_array(struct agp_memory *mem); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct 
agp_bridge_driver ali_m154 .free_by_type = agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_br
[PATCH] [AGPGART] Add agp-type-to-mask-type method missing from some drivers.
From: Thomas Hellstrom <[EMAIL PROTECTED]>

diff --git a/drivers/char/agp/parisc-agp.c b/drivers/char/agp/parisc-agp.c
index 17c50b0..b7b4590 100644
--- a/drivers/char/agp/parisc-agp.c
+++ b/drivers/char/agp/parisc-agp.c
@@ -228,6 +228,7 @@ struct agp_bridge_driver parisc_agp_driv
 	.free_by_type		= agp_generic_free_by_type,
 	.agp_alloc_page		= agp_generic_alloc_page,
 	.agp_destroy_page	= agp_generic_destroy_page,
+	.agp_type_to_mask_type	= agp_generic_type_to_mask_type,
 	.cant_use_aperture	= 1,
 };
diff --git a/drivers/char/agp/sis-agp.c b/drivers/char/agp/sis-agp.c
index a00fd48..60342b7 100644
--- a/drivers/char/agp/sis-agp.c
+++ b/drivers/char/agp/sis-agp.c
@@ -140,6 +140,7 @@ static struct agp_bridge_driver sis_driv
 	.free_by_type		= agp_generic_free_by_type,
 	.agp_alloc_page		= agp_generic_alloc_page,
 	.agp_destroy_page	= agp_generic_destroy_page,
+	.agp_type_to_mask_type	= agp_generic_type_to_mask_type,
 };

 static struct agp_device_ids sis_agp_device_ids[] __devinitdata =
--
1.4.1
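For bridge drivers that have no driver-specific memory types, the new agp_type_to_mask_type hook simply points at the generic helper, as the hunks above do. A minimal sketch of what such a generic fallback can look like, following the AGP_USER_TYPES convention from the patch description (an illustration against the declarations in drivers/char/agp/agp.h and include/linux/agp_backend.h, not the code from the patch itself):

static int example_type_to_mask_type(struct agp_bridge_data *bridge, int type)
{
	/*
	 * Sketch only: hardware types below AGP_USER_TYPES map one-to-one
	 * onto mask types, while drm/user-populated types fall back to the
	 * default (uncached) mask type 0.
	 */
	if (type >= AGP_USER_TYPES)
		return 0;
	return type;
}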
agpgart: drm-populated memory types
Dave and Arjan,

I'm resending a slightly reworked version of the agpgart patch for drm-populated memory types. The address-based vmalloc / vfree has been replaced and encapsulated in agp_vkmalloc() / agp_vkfree(), which both take a flag argument indicating whether to use vmalloc or kmalloc. This, at least, gets rid of the portability problem, and the chances of running into trouble in the future will be small if all allocs / frees of these memory areas are done using these functions.

A short recap of why I believe the kmalloc / vmalloc construct is necessary:

0) The current code uses vmalloc only.
1) The allocated area ranges from 4 bytes possibly up to 512 kB, depending on the size of the AGP buffer allocated.
2) Large buffers are very few. Small buffers tend to be quite many. If we continue to use vmalloc only, or another page-based scheme, we will waste approximately one page per buffer, together with the added slowness of vmalloc. This will severely hurt applications with a lot of small texture buffers.

Please let me know if you still consider this unacceptable. In that case I suggest sticking with vmalloc for now. Also please let me know if there are other parts of the patch that should be reworked.

The patch that follows is against Dave's agpgart repo.

Regards,
Thomas
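To make the size-based split concrete, here is a minimal sketch of the agp_vkmalloc() / agp_vkfree() pair described above. The prototypes follow the accompanying patch; the bodies and the one-page threshold are illustrative assumptions, not the patch code:

#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

/*
 * Sketch: small buffers come from kmalloc (avoiding the per-buffer page
 * waste and the slowness of vmalloc), large buffers from vmalloc. The
 * flag records which allocator was used so the matching free is called.
 */
void agp_vkmalloc(size_t size, unsigned long **addr, u8 *vmalloc_flag)
{
	if (size <= PAGE_SIZE) {		/* assumed threshold */
		*addr = kmalloc(size, GFP_KERNEL);
		*vmalloc_flag = 0;
	} else {
		*addr = vmalloc(size);
		*vmalloc_flag = 1;
	}
}

void agp_vkfree(unsigned long *addr, u8 vmalloc_flag)
{
	if (vmalloc_flag)
		vfree(addr);
	else
		kfree(addr);
}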
[PATCH] agpgart: Allow drm-populated agp memory types
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 133 +++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 185 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 19 files changed, 298 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..7c75389 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_vkmalloc(size_t size, unsigned long **addr, u8 *vmalloc_flag); +void agp_vkfree(unsigned long *addr, u8 vmalloc_flag); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct agp_bridge_driver ali_m154 .free_by_type = 
agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_bridge_driver alpha_core_agp_ .free_by_type = agp_generic_free_by_type, .agp_alloc_page =
Re: 2.6.19-rc6-rt5
Something is really wrong with page alloc on this one. Compiled 2.6.19-rc6-rt5 with the one patch to page_alloc.c as posted on the list here. Kernel uses around 50% mem and 30% swap without doing anything. I get a lot of these: X invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 [] out_of_memory+0x176/0x1d0 [] __alloc_pages+0x286/0x2f0 [] __get_free_pages+0x46/0x60 [] __pollwait+0xb0/0x100 [] unix_poll+0xc6/0xd0 [] sock_poll+0x23/0x30 [] do_select+0x288/0x4c0 [] __pollwait+0x0/0x100 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] core_sys_select+0x223/0x360 [] __schedule+0x2e9/0x6b0 [] convert_fxsr_from_user+0x22/0xf0 [] sys_select+0xff/0x1e0 [] sys_gettimeofday+0x3b/0x90 [] sysenter_past_esp+0x56/0x79 === Mem-info: DMA per-cpu: CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Normal per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 31 Cold: hi: 62, btch: 15 usd: 58 HighMem per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 66 Cold: hi: 62, btch: 15 usd: 14 Active:111463 inactive:36109 dirty:0 writeback:0 unstable:0 free:4018 slab:163934 mapped:26114 pagetables:874 DMA free:3560kB min:68kB low:84kB high:100kB active:396kB inactive:356kB present:16256kB pages_scanned:1370 all_unreclaimable? yes lowmem_reserve[]: 0 873 1254 Normal free:3720kB min:3744kB low:4680kB high:5616kB active:111304kB inactive:108296kB present:894080kB pages_scanned:339028 all_unreclaimable? yes lowmem_reserve[]: 0 0 3047 HighMem free:8792kB min:380kB low:788kB high:1196kB active:334152kB inactive:35784kB present:390084kB pages_scanned:0 all_unreclaimable? 
no lowmem_reserve[]: 0 0 0 DMA: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3560kB Normal: 0*4kB 5*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3720kB HighMem: 924*4kB 517*8kB 40*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 8792kB Swap cache: add 107141, delete 56933, find 4493/5856, race 0+0 Free swap = 113440kB Total swap = 488336kB Free swap: 113440kB 327664 pages of RAM 98288 pages of HIGHMEM 4383 reserved pages 94253 pages shared 50208 pages swap cached 0 pages dirty 0 pages writeback 26114 pages mapped 163934 pages slab 874 pages pagetables 327664 pages of RAM 98288 pages of HIGHMEM 4383 reserved pages 94253 pages shared 50208 pages swap cached 0 pages dirty 0 pages writeback 26114 pages mapped 163934 pages slab 874 pages pagetables audacious invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 [] out_of_memory+0x176/0x1d0 [] __alloc_pages+0x286/0x2f0 [] cache_alloc_refill+0x30e/0x5d0 [] kmem_cache_alloc+0x57/0x60 [] sock_alloc_inode+0x19/0x60 [] alloc_inode+0x19/0x190 [] fget_light+0x85/0xa0 [] new_inode+0x16/0x90 [] sock_alloc+0x14/0x70 [] sys_accept+0x56/0x270 [] do_notify_resume+0x402/0x750 [] convert_fxsr_from_user+0x22/0xf0 [] sys_socketcall+0xd1/0x280 [] sysenter_past_esp+0x56/0x79 === Mem-info: DMA per-cpu: CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Normal per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 31 Cold: hi: 62, btch: 15 usd: 58 HighMem per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 66 Cold: hi: 62, btch: 15 usd: 14 Active:111494 inactive:36078 dirty:0 writeback:0 unstable:0 free:4018 slab:163934 mapped:26114 pagetables:874 DMA free:3560kB min:68kB low:84kB high:100kB active:396kB inactive:356kB present:16256kB pages_scanned:1370 all_unreclaimable? yes lowmem_reserve[]: 0 873 1254 Normal free:3720kB min:3744kB low:4680kB high:5616kB active:111420kB inactive:108180kB present:894080kB pages_scanned:339127 all_unreclaimable? yes lowmem_reserve[]: 0 0 3047 HighMem free:8792kB min:380kB low:788kB high:1196kB active:334160kB inactive:35776kB present:390084kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 DMA: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3560kB Normal: 0*4kB 5*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3720kB HighMem: 924*4kB 517*8kB 40*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 8792kB Swap cache: add 107141, delete 56933, find 4493/5856, race 0+0 Free swap = 113440kB Total swap = 488336kB Free swap:
[BUG] unable to handle kernel NULL pointer dereference at virtual address 00000003
This Oops happens under heavy load with http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7b104bcb8e460e45a1aebe3da9b86aacdb4cab12 as head. I also ran the powertop tool in parallel with the build process. This is what I could capture with netconsole:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000003
printing eip:
c0136e32
*pde =
Oops: [#1] SMP
Modules linked in: applesmc evdev snd_seq snd_seq_device

System.map says:

c0136d47 t tick_nohz_handler
c0136e30 t match_entries
c0136e5a t tstats_open

match_entries is part of kernel/time/timer_stats.c and is only used within timer_stats.c. I guess without stack values this is hard to debug...

I cc'ed Ingo Molnar and Thomas Gleixner as they signed off the initial patch for the timer_stats support: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=82f67cd9fca8c8762c15ba7ed0d5747588c1e221

Kind regards,
Thomas
2.6.21-rc3 regression - bug 8066
Hello.

Regarding bug 8066 (http://bugzilla.kernel.org/show_bug.cgi?id=8066): is there a particular reason that prevents the patch from becoming part of the 2.6.21 release? Without this patch I have no battery icon, and this is a regression against 2.6.20.

With kind regards,
Thomas
Re: [5/6] 2.6.21-rc3: known regressions
> Subject    : suspend/resume hangs until keypress
> References : http://bugzilla.kernel.org/show_bug.cgi?id=8181
> Submitter  : Tomas Janousek <[EMAIL PROTECTED]>
> Status     : unknown

Can you please try to compile without nohz and without hrtimers and try it again? This may be the same error I encounter. See also: http://www.ussg.iu.edu/hypermail/linux/kernel/0703.1/1506.html

With kind regards,
Thomas
Re: evdev* devices change major/minors after suspend/resume (udev?)
Quoting Soeren Sonnenburg <[EMAIL PROTECTED]>:

> Very concrete it is this evdev that may be missing... and just FYI this
> also seems to cause trouble in Xorg - sometimes the appletouch mouse is
> not yet back...
>
> /dev/input/by-id/usb-Apple_Computer_Apple_Internal_Keyboard_._Trackpad-event-kbd -> ../event5
>
> Any hints welcome,

See also this discussion thread: http://www.uwsg.iu.edu/hypermail/linux/kernel/0703.3/0988.html

But there is no solution for this problem right now.

With kind regards,
Thomas
Bonded magnets for your Motor
Dear Sir/Madam,

This is Thomas from Topmag, Shenzhen, China. From your website, we know that our magnet products may be used in your products. Our branch company also manufactures and supplies raw materials such as high-class praseodymium-neodymium-iron alloy, dysprosium, chromium metal, etc. We are specialized in sintered & bonded NdFeB magnets, SmCo magnets, Alnico magnets, ferrite magnets and various kinds of magnetic assemblies.

Looking forward to hearing from you.

Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
xpad_probe: undefined reference to `led_classdev_register'
Hi.

Current Linus' git tree:

x86_64-unknown-linux-gnu-ld: BFD 2.15 assertion fail /home/thomas/source/crosstool-0.43/build/x86_64-unknown-linux-gnu/gcc-3.4.5-glibc-2.3.6/binutils-2.15/bfd/linker.c:619
drivers/built-in.o(.text+0x20749d): In function `xpad_probe':
: undefined reference to `led_classdev_register'
drivers/built-in.o(.text+0x20756c): In function `xpad_disconnect':
: undefined reference to `led_classdev_unregister'
make: *** [.tmp_vmlinux1] Fehler 1

Any ideas?

Kind regards,
Thomas
Magnetic assembles for your electric motor
Dear,

This is Thomas from a Chinese magnets company. A friend introduced me to you; most of your items may use bonded magnets or injection magnets. Maybe we can help you with the magnet items; if you need anything, please feel free to let us know.

Thanks,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
Re: [i915] BUG: Bad page state in process Xorg
Hi,

It turns out that this seems to be a bug in the udl DRM driver. I bisected the problem to this patch: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/udl?id=5dc9e1e87229cb786a5bb58ddd0d60fee6eb4641

With kind regards,
Thomas

On 22.11.2013 17:18, Daniel Vetter wrote:
>
> On Fri, Nov 22, 2013 at 4:54 PM, Thomas Meyer wrote:
>> On 22.11.2013 at 11:55, Daniel Vetter wrote:
>>
>> On Fri, Nov 22, 2013 at 11:36 AM, Dave Airlie wrote:
>>>> Hi,
>>>
>>> cc'ing mailing list,
>>>
>>> Daniel any ideas?
>>
>> Nope, not really :( And no ideas how to triage this further - if it
>> takes 9 days to hit it eventually we'll have a real hard time. Or does
>> this happen even after just a short X run?
>
> Seems to happen every time while stopping the X server. Also after a short
> run time.
>
> The current Fedora 3.11 kernel doesn't show this bug. I'm using Fedora 19
> with a self-compiled kernel.
>
> I did turn on CONFIG_DEBUG_PAGEALLOC but this didn't show any wrongness.

> In that case I think the bisect is the fastest way to insight - atm
> I'm really at a loss what could be wrong here.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Manufacturer of magnets/Supplier of Bosch
Dear Sir,

How are you? This is Thomas from China Topmagtech Ltd. We got your contact from a friend and know that you are buying some precisely magnetized magnets for your items. We are a professional manufacturer of sintered NdFeB magnets & magnetic assemblies. Our advantages are: precise magnetization, complicated shapes and magnetic assemblies, etc. We hope to have the chance to build long-term cooperation with you. We can offer you competitive prices and free samples. For any question or request, please feel free to contact us.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Injection magnets for your Automation
Dear,

Good day. May I have your attention? We are a professional manufacturer & exporter of magnets and magnetic products located in Shenzhen, China; with two production lines, the annual capacity exceeds 1,000 tons. Thanks to skilled workers and engineers, some of them having more than 15 years of experience in the magnet industry, we are able to provide you with good-quality magnets and professional consultancy. We are especially competitive in the N35-N50, H, SH, UH series, etc., which are widely used in motors, sensors, speakers, generators, wind turbines and other electric or industrial devices. Below are some pictures of our facilities; you can also view more product information on our Topmag website. If we could be of any further assistance, please don't hesitate to contact us directly. Thank you.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: magnet materials for your auto industry
Dear,

This is Thomas from a Chinese magnets company. A friend introduced me to you; most of your items may use bonded magnets or injection magnets. Maybe we can help you with the magnet items; if you need anything, please feel free to let us know.

Thanks,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: Motor parts magnet vendor - bonded and injection magnets/Topmagtech
Dear Purchasing Manager,

This is Thomas from a Chinese magnet company. As you may know, the price of raw materials has been rising, so it would be good timing for you to purchase magnets for your items. Feel free to contact me for any further questions or enquiries.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
Supplying High Quality NdFeB Materials
Dear,

Good day. May I have your attention? We are a professional manufacturer & exporter of magnets and magnetic products located in Shenzhen, China; with two production lines, the annual capacity exceeds 1,000 tons. Thanks to skilled workers and engineers, some of them having more than 15 years of experience in the magnet industry, we are able to provide you with good-quality magnets and professional consultancy. We are especially competitive in the N35-N50, H, SH, UH series, etc., which are widely used in motors, sensors, speakers, generators, wind turbines and other electric or industrial devices. Below are some pictures of our facilities; you can also view more product information on our Topmag website. If we could be of any further assistance, please don't hesitate to contact us directly. Thank you.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: [PATCH 3/3] regulator: add device tree support for max8997
On 26 November 2012 19:41, Mark Brown wrote:
> On Mon, Nov 26, 2012 at 07:16:04PM +0530, Thomas Abraham wrote:
>
>> and this patch applied cleanly. Could you please let me know if there
>> is anything I need to be doing differently for this.
>
> Hrm, try applying it on the relevant topic branch. Your comments about
> rebasing on top of MFD changes did suggest that there's something in the
> MFD tree so I didn't check terribly closely.

I tried applying this patch on the max8997 branch in your regulator tree. But this patch does not apply cleanly on that branch because commits "5eb9f2b96381" (regulator: remove use of __devexit_p), "a5023574d120" (regulator: remove use of __devinit) and "8dc995f56ef7" (regulator: remove use of __devexit) are not available on that branch, while these commits are already in your for-next branch.

I am not sure if this is of any help in rebasing the patch onto the existing max8997 branch. If you could suggest how I should prepare this patch so that it applies cleanly for you, I could do that.

Thanks,
Thomas.
[PATCH] regulator: add device tree support for max8997
Add device tree based discovery support for max8997. Cc: Karol Lewandowski Cc: Rajendra Nayak Cc: Rob Herring Cc: Grant Likely Signed-off-by: Thomas Abraham Acked-by: MyungJoo Ham Reviewed-by: Tomasz Figa --- This patch is based on 'topic/max8997' branch of Mark Brown's regulator tree. .../bindings/regulator/max8997-regulator.txt | 146 +++ drivers/mfd/max8997.c | 73 ++- drivers/regulator/max8997.c| 148 +++- include/linux/mfd/max8997-private.h|1 + include/linux/mfd/max8997.h|1 + 5 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 Documentation/devicetree/bindings/regulator/max8997-regulator.txt diff --git a/Documentation/devicetree/bindings/regulator/max8997-regulator.txt b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt new file mode 100644 index 000..9fd69a1 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt @@ -0,0 +1,146 @@ +* Maxim MAX8997 Voltage and Current Regulator + +The Maxim MAX8997 is a multi-function device which includes volatage and +current regulators, rtc, charger controller and other sub-blocks. It is +interfaced to the host controller using a i2c interface. Each sub-block is +addressed by the host system using different i2c slave address. This document +describes the bindings for 'pmic' sub-block of max8997. + +Required properties: +- compatible: Should be "maxim,max8997-pmic". +- reg: Specifies the i2c slave address of the pmic block. It should be 0x66. + +- max8997,pmic-buck1-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck1 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +- max8997,pmic-buck2-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck2 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +- max8997,pmic-buck5-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck5 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +[1] If none of the 'max8997,pmic-buck[1/2/5]-uses-gpio-dvs' optional +property is specified, the 'max8997,pmic-buck[1/2/5]-dvs-voltage' +property should specify atleast one voltage level (which would be a +safe operating voltage). + +If either of the 'max8997,pmic-buck[1/2/5]-uses-gpio-dvs' optional +property is specified, then all the eigth voltage values for the +'max8997,pmic-buck[1/2/5]-dvs-voltage' should be specified. + +Optional properties: +- interrupt-parent: Specifies the phandle of the interrupt controller to which + the interrupts from max8997 are delivered to. +- interrupts: Interrupt specifiers for two interrupt sources. + - First interrupt specifier is for 'irq1' interrupt. + - Second interrupt specifier is for 'alert' interrupt. +- max8997,pmic-buck1-uses-gpio-dvs: 'buck1' can be controlled by gpio dvs. +- max8997,pmic-buck2-uses-gpio-dvs: 'buck2' can be controlled by gpio dvs. +- max8997,pmic-buck5-uses-gpio-dvs: 'buck5' can be controlled by gpio dvs. + +Additional properties required if either of the optional properties are used: +- max8997,pmic-ignore-gpiodvs-side-effect: When GPIO-DVS mode is used for + multiple bucks, changing the voltage value of one of the bucks may affect + that of another buck, which is the side effect of the change (set_voltage). + Use this property to ignore such side effects and change the voltage. + +- max8997,pmic-buck125-default-dvs-idx: Default voltage setting selected from + the possible 8 options selectable by the dvs gpios. 
The value of this + property should be between 0 and 7. If not specified or if out of range, the + default value of this property is set to 0. + +- max8997,pmic-buck125-dvs-gpios: GPIO specifiers for three host gpio's used + for dvs. The format of the gpio specifier depends in the gpio controller. + +Regulators: The regulators of max8997 that have to be instantiated should be +included in a sub-node named 'regulators'. Regulator nodes included in this +sub-node should be of the format as listed below. + + regulator_name { + standard regulator bindings here + }; + +The following are the names of the regulators that the max8997 pmic block +supports. Note: The 'n' in LDOn and BUCKn represents the LDO or BUCK number +as per the datasheet of max8997. + + - LDOn + - valid values for n are 1 to 18 and 21 + - Example: LDO0, LD01, LDO2, LDO21 + - BUCKn + - valid values for n are 1 to 7. + - Example: BUCK1, BUCK2, BUCK3, BUCK7 + + - ENVICHG: Battery Charging Current Monitor Output. This
Re: [PATCH 1/3] i2c: exynos5: add High Speed I2C controller driver
> + i2c_del_adapter(&i2c->adap); > + free_irq(i2c->irq, i2c); > + > + clk_disable_unprepare(i2c->clk); > + clk_put(i2c->clk); > + > + iounmap(i2c->regs); > + > + release_resource(i2c->ioarea); > + exynos5_i2c_dt_gpio_free(i2c); > + kfree(i2c->ioarea); > + > + return 0; > +} > + > +#ifdef CONFIG_PM > +static int exynos5_i2c_suspend_noirq(struct device *dev) > +{ > + struct platform_device *pdev = to_platform_device(dev); > + struct exynos5_i2c *i2c = platform_get_drvdata(pdev); > + > + i2c_lock_adapter(&i2c->adap); > + i2c->suspended = 1; > + i2c_unlock_adapter(&i2c->adap); > + > + return 0; > +} > + > +static int exynos5_i2c_resume(struct device *dev) > +{ > + struct platform_device *pdev = to_platform_device(dev); > + struct exynos5_i2c *i2c = platform_get_drvdata(pdev); > + > + i2c_lock_adapter(&i2c->adap); > + clk_prepare_enable(i2c->clk); > + exynos5_i2c_init(i2c); > + clk_disable_unprepare(i2c->clk); > + i2c->suspended = 0; > + i2c_unlock_adapter(&i2c->adap); > + > + return 0; > +} > + > +static const struct dev_pm_ops exynos5_i2c_dev_pm_ops = { > + .suspend_noirq = exynos5_i2c_suspend_noirq, > + .resume_noirq = exynos5_i2c_resume, > +}; > + > +#define EXYNOS5_DEV_PM_OPS (&exynos5_i2c_dev_pm_ops) > +#else > +#define EXYNOS5_DEV_PM_OPS NULL > +#endif > + > +static struct platform_driver exynos5_i2c_driver = { > + .probe = exynos5_i2c_probe, > + .remove = exynos5_i2c_remove, > + .id_table = exynos5_driver_ids, > + .driver = { > + .owner = THIS_MODULE, > + .name = "exynos5-i2c", > + .pm = EXYNOS5_DEV_PM_OPS, > + .of_match_table = of_match_ptr(exynos5_i2c_match), > + }, > +}; > + > +static int __init i2c_adap_exynos5_init(void) > +{ > + return platform_driver_register(&exynos5_i2c_driver); > +} > +subsys_initcall(i2c_adap_exynos5_init); > + > +static void __exit i2c_adap_exynos5_exit(void) > +{ > + platform_driver_unregister(&exynos5_i2c_driver); > +} > +module_exit(i2c_adap_exynos5_exit); > + > +MODULE_DESCRIPTION("Exynos5 HS-I2C Bus driver"); > +MODULE_AUTHOR("Taekgyun Ko, "); > +MODULE_LICENSE("GPL"); > diff --git a/drivers/i2c/busses/i2c-exynos5.h > b/drivers/i2c/busses/i2c-exynos5.h > new file mode 100644 > index 000..063051e > --- /dev/null > +++ b/drivers/i2c/busses/i2c-exynos5.h > @@ -0,0 +1,80 @@ > +/* > + * Copyright (C) 2012 Samsung Electronics Co., Ltd. > + * > + * Exynos5 series HS-I2C Controller > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. 
> +*/ > + > +#ifndef __ASM_ARCH_REGS_HS_IIC_H > +#define __ASM_ARCH_REGS_HS_IIC_H __FILE__ > + > +/* > + * Register Map > + */ > +#define HSI2C_CTL 0x00 > +#define HSI2C_FIFO_CTL 0x04 > +#define HSI2C_TRAILIG_CTL 0x08 > +#define HSI2C_CLK_CTL 0x0C > +#define HSI2C_CLK_SLOT 0x10 > +#define HSI2C_INT_ENABLE 0x20 > +#define HSI2C_INT_STATUS 0x24 > +#define HSI2C_ERR_STATUS 0x2C > +#define HSI2C_FIFO_STATUS 0x30 > +#define HSI2C_TX_DATA 0x34 > +#define HSI2C_RX_DATA 0x38 > +#define HSI2C_CONF 0x40 > +#define HSI2C_AUTO_CONFING 0x44 > +#define HSI2C_TIMEOUT 0x48 > +#define HSI2C_MANUAL_CMD 0x4C > +#define HSI2C_TRANS_STATUS 0x50 > +#define HSI2C_TIMING_HS1 0x54 > +#define HSI2C_TIMING_HS2 0x58 > +#define HSI2C_TIMING_HS3 0x5C > +#define HSI2C_TIMING_FS1 0x60 > +#define HSI2C_TIMING_FS2 0x64 > +#define HSI2C_TIMING_FS3 0x68 > +#define HSI2C_TIMING_SLA 0x6C > +#define HSI2C_ADDR 0x70 > + > +/* I2C_CTL Register */ > +#define HSI2C_FUNC_MODE_I2C(1u << 0) > +#define HSI2C_MASTER (1u << 3) > +#define HSI2C_RXCHON (1u << 6) > +#define HSI2C_TXCHON (1u << 7) > + > +/* I2C_FIFO_CTL Register */ > +#define HSI2C_RXFIFO_EN(1u << 0) > +#define HSI2C_TXFIFO_EN(1u << 1) > +#define HSI2C_TXFIFO_TRIGGER_LEVEL (0x20 << 16) > +#define HSI2C_RXFIFO_TRIGGER_LEVEL (0x20 << 4) > + > +/* I2C_TRAILING_CTL Register */ > +#define HSI2C_TRAILING_COUNT (0xf) > + > +/* I2C_INT_EN Register */ > +#define HSI2C_INT_TX_ALMOSTEMPTY_EN(1u << 0) /* For TX FIFO */ > +#define HSI2C_INT_RX_ALMOSTFULL_EN (1u << 1) /* For RX FIFO */ > +#define HSI2C_INT_TRAILING_EN (1u << 6) > + > +/* I2C_CONF Register */ > +#define HSI2C_AUTO_MODE(1u << 31) > +#define HSI2C_10BIT_ADDR_MODE (1u << 30) > +#define HSI2C_HS_MODE (1u << 29) > + > +/* I2C_AUTO_CONF Register */ > +#define HSI2C_READ_WRITE (1u << 16) > +#define HSI2C_STOP_AFTER_TRANS (1u << 17) > +#define HSI2C_MASTER_RUN (1u << 31) > + > +/* I2C_TIMEOUT Register */ > +#define HSI2C_TIMEOUT_EN (1u << 31) > + > +#define HSI2C_FIFO_EMPTY (0x1000100) > + > +#define HSI2C_FS_BPS 40 > +#define HSI2C_HS_BPS 250 > + > +#endif /* __ASM_ARCH_REGS_HS_IIC_H */ Since these constants are only use in i2c-exynos5.c file, it is better to move these definitions into i2c-exynos5.c file. Thanks, Thomas. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6 v5] arm highbank: add support for pl320 IPC
Dear Mark Langsdorf,

On Tue, 27 Nov 2012 09:04:32 -0600, Mark Langsdorf wrote:

> +int ipc_transmit(u32 *data);

ipc_transmit() looks to me like a way too generic name to be exposed to the entire kernel.

> +extern int pl320_ipc_register_notifier(struct notifier_block *nb);
> +extern int pl320_ipc_unregister_notifier(struct notifier_block *nb);

Why some "extern" here? You don't have these for the other functions in this header file.

Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux development, consulting, training and support.
http://free-electrons.com
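For reference, the notifier pair quoted above follows the standard kernel notifier_block pattern, so a consumer of the pl320 IPC would hook in roughly as sketched below. The two extern prototypes are taken from the patch under review; the callback body and the example names are assumptions:

#include <linux/module.h>
#include <linux/notifier.h>

/* Prototypes as quoted from the patch under review. */
extern int pl320_ipc_register_notifier(struct notifier_block *nb);
extern int pl320_ipc_unregister_notifier(struct notifier_block *nb);

/* Called through the notifier chain for each incoming IPC message. */
static int example_ipc_notify(struct notifier_block *nb,
			      unsigned long action, void *data)
{
	/* consume the message carried in 'data' here */
	return NOTIFY_OK;
}

static struct notifier_block example_ipc_nb = {
	.notifier_call = example_ipc_notify,
};

static int __init example_ipc_init(void)
{
	return pl320_ipc_register_notifier(&example_ipc_nb);
}
module_init(example_ipc_init);

static void __exit example_ipc_exit(void)
{
	pl320_ipc_unregister_notifier(&example_ipc_nb);
}
module_exit(example_ipc_exit);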
Re: [regression] 3.7+ suspend to RAM/offline CPU fails with nmi_watchdog=0 (bisected)
On Wed, 28 Nov 2012, Joseph Salisbury wrote:
> On 11/23/2012 08:11 AM, Norbert Warmuth wrote:
> > Thomas Gleixner writes:
> > > On Wed, 21 Nov 2012, Norbert Warmuth wrote:
> > > > 3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
> > > > offline CPUs. It's reproducible with a KVM guest and a physical
> > > > system.
> > > Does the patch below fix it?
> > Yes.
> >
> > - Norbert
> >
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index 9d4c8d5..e3ef521 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -368,6 +368,9 @@ static void watchdog_disable(unsigned int cpu)
> > >  {
> > >  	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
> > > +	if (!watchdog_enabled)
> > > +		return;
> > > +
> > >  	watchdog_set_prio(SCHED_NORMAL, 0);
> > >  	hrtimer_cancel(hrtimer);
> > >  	/* disable the perf event */
>
> Hi Thomas,
>
> Your patch also fixes a bug[0] reported against Ubuntu. I assume the window
> for v3.7 is closed. Will you be submitting this patch for inclusion in v3.8?

Sure, with a stable tag so it gets back into 3.7.

Thanks,

	tglx
Re: [RFC v2 8/8] drm: tegra: Add gr2d device
On 11/28/2012 02:33 PM, Lucas Stach wrote:
> On Wednesday, 28.11.2012 at 15:17 +0200, Terje Bergström wrote:
> > On 28.11.2012 01:00, Dave Airlie wrote:
> > > We generally aim for the first, to stop the gpu from reading/writing
> > > any memory it hasn't been granted access to; the second is nice to
> > > have though, but really requires a GPU with VM to implement properly.
> >
> > I wonder if we should aim at root-only access on Tegra20, and force
> > IOMMU on Tegra30 and fix the remaining issues we have with IOMMU.
> >
> > The firewall turns out to be more complicated than I wished. The biggest
> > problem is that we aim at zero-copy for everything possible, including
> > command streams. The kernel gets a handle to a command stream, but the
> > command stream is allocated by the user space process. So user space
> > can tamper with the stream once it's been written to the host1x 2D
> > channel.
>
> So this is obviously wrong. Userspace has to allocate a pushbuffer from
> the kernel just as every other buffer, then map it into its own address
> space to push in commands. At submit time of the pushbuf the kernel has
> to make sure that userspace is not able to access the memory any more,
> i.e. the kernel shoots down the vma or pagetable of the vma.

To me this sounds very expensive. Zapping the page table requires a CPU TLB flush on all cores that have touched the buffer, not to mention the kernel calls required to set up the page table once the buffer is reused. If this usage scheme is then combined with a command verifier or "firewall" that reads from a *write-combined* pushbuffer, performance will be bad. Really bad.

In such situations I think one should consider copy-from-user while validating, and let user space set up the command buffer in malloced memory.

/Thomas
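A minimal sketch of the copy-from-user-while-validating idea suggested above, assuming the command stream is a flat array of 32-bit words. validate_cmd() is a hypothetical placeholder for the real firewall checks, and the actual host1x submission path is not shown:

#include <linux/err.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/uaccess.h>

/* Placeholder for the real checks (opcodes, relocations, address ranges). */
static bool validate_cmd(u32 cmd)
{
	return true;
}

/*
 * Copy the user-supplied command words into a kernel-owned buffer,
 * validate the kernel copy, and return it; user space can no longer
 * tamper with the stream the hardware will actually see.
 */
static u32 *copy_and_validate_stream(const u32 __user *ucmds, size_t words)
{
	u32 *cmds;
	size_t i;

	cmds = kcalloc(words, sizeof(*cmds), GFP_KERNEL);
	if (!cmds)
		return ERR_PTR(-ENOMEM);

	if (copy_from_user(cmds, ucmds, words * sizeof(*cmds))) {
		kfree(cmds);
		return ERR_PTR(-EFAULT);
	}

	for (i = 0; i < words; i++) {
		if (!validate_cmd(cmds[i])) {
			kfree(cmds);
			return ERR_PTR(-EINVAL);
		}
	}

	return cmds;	/* the caller submits this kernel copy */
}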
Re: 3.6.7-rt18 ARM BUG_ON() at kernel/sched/core.c:3817
On Wed, 28 Nov 2012, Frank Rowand wrote:

> 3.6.7-rt18: kernel BUG at .../kernel/sched/core.c:3817!
>
> Grant reported this same problem for 3.6.5-rt15.
>
> I am seeing it on a different arm board.
>
> Here is the BUG_ON():
>
>    asmlinkage void __sched preempt_schedule_irq(void)
>    {
>        struct thread_info *ti = current_thread_info();
>
>        /* Catch callers which need to be fixed */
>        BUG_ON(ti->preempt_count || !irqs_disabled());
>
> Putting in some extra printk(), the BUG_ON() is triggering because
> ti->preempt_count is non-zero.
>
> It appears that the cause is in arch/arm/kernel/entry-armv.S.
>
> The call to preempt_schedule_irq() is from svc_preempt:
>
>    #ifdef CONFIG_PREEMPT
>    svc_preempt:
>        mov r8, lr
>    1:  bl preempt_schedule_irq         @ irq en/disable is done inside
>
> svc_preempt is branched to from one of two possible places. The first was
> present before the lazy preempt code was added. The first appears ok to me.
> (Note that the first branch does not occur if preempt count is non-zero.)
>
> The second branch can occur even if the preempt count is non-zero (which is
> what the BUG_ON() is finding):
>
>    __irq_svc:
>        svc_entry
>        irq_handler
>
>    #ifdef CONFIG_PREEMPT
>        get_thread_info tsk
>        ldr r8, [tsk, #TI_PREEMPT]        @ get preempt count
>        ldr r0, [tsk, #TI_FLAGS]          @ get flags
>        teq r8, #0                        @ if preempt count != 0
>        movne r0, #0                      @ force flags to 0
>        tst r0, #_TIF_NEED_RESCHED
>        blne svc_preempt
>        ldr r8, [tsk, #TI_PREEMPT_LAZY]   @ get preempt lazy count
>        ldr r0, [tsk, #TI_FLAGS]          @ get flags
>        teq r8, #0                        @ if preempt lazy count != 0
>        movne r0, #0                      @ force flags to 0
>        tst r0, #_TIF_NEED_RESCHED_LAZY
>        blne svc_preempt
>    #endif

Bah. I knew that I had messed up the ASM magic.
Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)
My laptop is an Acer 1810T. I see this error message on each boot.

Kind regards,
Thomas

Jiri Kosina wrote:

> On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> >
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>>
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>>
>> commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>> Author: Daniel Vetter
>> Date:   Sat Dec 1 13:53:45 2012 +0100
>>
>>     drm/i915: use the gmbus irq for waits
>>
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's
>> happening in parallel.
>>
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with
>> drm.debug=0xe.
>
> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after the last
>
> 	I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.
>
> --
> Jiri Kosina
> SUSE Labs
Re: [RFC PATCH v2] of/pci: Provide support for parsing PCI DT ranges property
Hello, What is the status of the below patch? Both Marvell PCIe driver and Tegra PCIe driver need a way to parse the ranges = <...> property of the PCI DT node. Would it be possible to get this patch merged for 3.10, or get some review comments that would allow us to rework it in time for 3.10 ? Thanks, Thomas On Thu, 21 Feb 2013 15:47:09 +, Andrew Murray wrote: > DT bindings for PCI host bridges often use the ranges property to describe > memory and IO ranges - this binding tends to be the same across architectures > yet several parsing implementations exist, e.g. arch/mips/pci/pci.c, > arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and > arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate > functionality provided by drivers/of/address.c. > > This patch factors out common implementations patterns to reduce overall > kernel > code and provide a means for host bridge drivers to directly obtain struct > resources from the DT's ranges property without relying on architecture > specific > DT handling. This will make it easier to write archiecture independent host > bridge > drivers and mitigate against further duplication of DT parsing code. > > This patch can be used in the following way: > > struct of_pci_range_iter iter; > for_each_of_pci_range(&iter, np) { > > //directly access properties of the address range, e.g.: > //iter.pci_space, iter.pci_addr, iter.cpu_addr, iter.size or > //iter.flags > > //alternatively obtain a struct resource, e.g.: > //struct resource res; > //range_iter_fill_resource(iter, np, res); > } > > Additionally the implementation takes care of adjacent ranges and merges them > into a single range (as was the case with powerpc and microblaze). > > The modifications to microblaze, mips and powerpc have not been tested. > > v2: > - This follows on from suggestions made by Grant Likely > (marc.info/?l=linux-kernel&m=136079602806328) > > Signed-off-by: Andrew Murray > Signed-off-by: Liviu Dudau > --- > arch/microblaze/pci/pci-common.c | 100 +++-- > arch/mips/pci/pci.c | 44 - > arch/powerpc/kernel/pci-common.c | 93 ++- > drivers/of/address.c | 54 > include/linux/of_address.h | 30 +++ > 5 files changed, 151 insertions(+), 170 deletions(-) > > diff --git a/arch/microblaze/pci/pci-common.c > b/arch/microblaze/pci/pci-common.c > index 4dbb505..ccc0d63 100644 > --- a/arch/microblaze/pci/pci-common.c > +++ b/arch/microblaze/pci/pci-common.c > @@ -659,67 +659,37 @@ void __devinit pci_process_bridge_OF_ranges(struct > pci_controller *hose, > struct device_node *dev, > int primary) > { > - const u32 *ranges; > - int rlen; > - int pna = of_n_addr_cells(dev); > - int np = pna + 5; > int memno = 0, isa_hole = -1; > - u32 pci_space; > - unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size; > unsigned long long isa_mb = 0; > struct resource *res; > + struct of_pci_range_iter iter; > > printk(KERN_INFO "PCI host bridge %s %s ranges:\n", > dev->full_name, primary ? 
"(primary)" : ""); > > - /* Get ranges property */ > - ranges = of_get_property(dev, "ranges", &rlen); > - if (ranges == NULL) > - return; > - > - /* Parse it */ > pr_debug("Parsing ranges property...\n"); > - while ((rlen -= np * 4) >= 0) { > + for_each_of_pci_range(&iter, dev) { > /* Read next ranges element */ > - pci_space = ranges[0]; > - pci_addr = of_read_number(ranges + 1, 2); > - cpu_addr = of_translate_address(dev, ranges + 3); > - size = of_read_number(ranges + pna + 3, 2); > - > pr_debug("pci_space: 0x%08x pci_addr:0x%016llx " > "cpu_addr:0x%016llx size:0x%016llx\n", > - pci_space, pci_addr, cpu_addr, size); > - > - ranges += np; > + iter.pci_space, iter.pci_addr, iter.cpu_addr, > + iter.size); > > /* If we failed translation or got a zero-sized region >* (some FW try to feed us with non sensical zero sized regions >* such as power3 which look like some kind of attem
Re: [PATCH 2/2] netlink: Diag core and basic socket info dumping
On 03/21/13 at 01:21pm, Andrey Vagin wrote: > diff --git a/include/uapi/linux/netlink_diag.h > b/include/uapi/linux/netlink_diag.h > new file mode 100644 > index 000..9328866 > --- /dev/null > +++ b/include/uapi/linux/netlink_diag.h > +enum { > + NETLINK_DIAG_MEMINFO, > + NETLINK_DIAG_GROUPS, > + > + NETLINK_DIAG_MAX, > +}; Please follow the common pattern and define NETLINK_DIAG_MAX as NETLINK_DIAG_GROUPS, like others do: [...] __NETLINK_DIAG_MAX, }; #define NETLINK_DIAG_MAX (__NETLINK_DIAG_MAX - 1) Everyone is used to doing: struct nlattr *attrs[NETLINK_DIAG_MAX+1]; nla_parse([...], NETLINK_DIAG_MAX, [...] In fact, the follow-up patch to ss is buggy because of this. UNIX_DIAG_MAX suffers from the same problem, which is probably the cause for this. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
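For reference, the pattern being asked for looks like this once written out in full (a sketch that mirrors the eventual fix; the surrounding header contents are assumed):

/* Sketch of include/uapi/linux/netlink_diag.h with the requested pattern.
 * The __NETLINK_DIAG_MAX sentinel keeps NETLINK_DIAG_MAX equal to the last
 * real attribute, so "struct nlattr *attrs[NETLINK_DIAG_MAX + 1];" together
 * with nla_parse(..., NETLINK_DIAG_MAX, ...) covers every attribute. */
enum {
	NETLINK_DIAG_MEMINFO,
	NETLINK_DIAG_GROUPS,

	__NETLINK_DIAG_MAX,
};
#define NETLINK_DIAG_MAX	(__NETLINK_DIAG_MAX - 1)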
Re: [PATCH 2/2] netlink: Diag core and basic socket info dumping
On 03/21/13 at 06:31pm, Andrew Vagin wrote: > The code in ss looks like you described: > struct rtattr *tb[UNIX_DIAG_MAX+1]; > ... > parse_rtattr(tb, UNIX_DIAG_MAX, (struct rtattr*)(r+1), > nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r))); > > > struct rtattr *tb[NETLINK_DIAG_MAX+1]; > ... > parse_rtattr(tb, NETLINK_DIAG_MAX, (struct rtattr*)(r+1), > nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r))) > > I think I should only update headers... Or I don't understand something. Right, fixing the headers will resolve the issue. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] net: fix *_DIAG_MAX constants
On 03/21/13 at 06:18pm, Andrey Vagin wrote: > Follow the common pattern and define *_DIAG_MAX like: > > [...] > __XXX_DIAG_MAX, > }; > > Because everyone is used to do: > > struct nlattr *attrs[XXX_DIAG_MAX+1]; > > nla_parse([...], XXX_DIAG_MAX, [...] > > Reported-by: Thomas Graf > Cc: "David S. Miller" > Cc: Pavel Emelyanov > Cc: Eric Dumazet > Cc: "Paul E. McKenney" > Cc: David Howells > Signed-off-by: Andrey Vagin Acked-by: Thomas Graf -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] net: fix *_DIAG_MAX constants
On 03/21/13 at 11:14am, David Miller wrote: > So you're ACK'ing a patch that makes changes to files that don't even > exist in the repository? I have been ACK'ing the patch in the context of the previous patch that I reviewed in the first place which in summary is now OK. But you are obviously right that a fixed version of the initial patch should be submitted instead. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH RESEND v2] of/pci: Provide support for parsing PCI DT ranges property
Dear Andrew Murray, On Fri, 1 Mar 2013 12:23:36 +, Andrew Murray wrote: > This patch factors out common implementations patterns to reduce overall > kernel > code and provide a means for host bridge drivers to directly obtain struct > resources from the DT's ranges property without relying on architecture > specific > DT handling. This will make it easier to write archiecture independent host > bridge > drivers and mitigate against further duplication of DT parsing code. > > This patch can be used in the following way: > > struct of_pci_range_iter iter; > for_each_of_pci_range(&iter, np) { > > //directly access properties of the address range, e.g.: > //iter.pci_space, iter.pci_addr, iter.cpu_addr, iter.size or > //iter.flags > > //alternatively obtain a struct resource, e.g.: > //struct resource res; > //range_iter_fill_resource(iter, np, res); > } > > Additionally the implementation takes care of adjacent ranges and merges them > into a single range (as was the case with powerpc and microblaze). > > The modifications to microblaze, mips and powerpc have not been tested. > > v2: > This follows on from suggestions made by Grant Likely > (marc.info/?l=linux-kernel&m=136079602806328) > > Signed-off-by: Andrew Murray > Signed-off-by: Liviu Dudau Thanks, I've tested this successfully with the Marvell PCIe driver. I'm about to send a new version of the Marvell PCIe patch set that includes this RFC proposal. I only made two small changes compared to your version, detailed below. > +#define for_each_of_pci_range(iter, np) \ > + for (; of_pci_process_ranges(iter, np);) In the initial part of the loop, I added a memset() to initialize to zero the "iter" structure. Otherwise, if you forget to do it before calling of_pci_process_ranges(), it may crash (depending on the random values present in the uninitialized structure). > +#define range_iter_fill_resource(iter, np, res) \ > + do { \ > + res->flags = iter.flags; \ > + res->start = iter.cpu_addr; \ > + res->end = iter.cpu_addr + iter.size - 1; \ > + res->parent = res->child = res->sibling = NULL; \ > + res->name = np->full_name; \ > + } while (0) And here, I enclosed all the usage of the macro parameters in parenthesis. Like (res)->flags instead of res->flags. If you don't do that, then passing &foobar as the 'res' parameter causes some compilation failure because &foobar->res is not valid, while (&foobar)->res is. Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
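A sketch of the two adjustments described above, applied to the macros quoted from the RFC (the exact final form is assumed, not taken from a posted patch):

/* Sketch only: the iterator is zeroed at loop entry, and every macro
 * argument is parenthesized so callers may pass expressions like &port->res. */
#define for_each_of_pci_range(iter, np)					\
	for (memset((iter), 0, sizeof(*(iter)));			\
	     of_pci_process_ranges((iter), (np));)

#define range_iter_fill_resource(iter, np, res)				\
	do {								\
		(res)->flags = (iter).flags;				\
		(res)->start = (iter).cpu_addr;				\
		(res)->end = (iter).cpu_addr + (iter).size - 1;		\
		(res)->parent = (res)->child = (res)->sibling = NULL;	\
		(res)->name = (np)->full_name;				\
	} while (0)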
Re: hrtimer possible issue
On Sun, 3 Feb 2013, Izik Eidus wrote: > Hi, > > it seems like hrtimer_enqueue_reprogram contain a race which could result in > timer.base switch during unlock/lock sequence. > > See the code at __hrtimer_start_range_ns where it calls > hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer > base for a short time and timer base switch can occur from a different CPU > thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base > switch could have happened and this causes the bug > > Try to start the same hrtimer from two different threads in kernel running > each one on a different CPU. Eventually one of the calls will cause timer base > switch while another thread is not expecting it. > > This can happen in virtualized environment where one thread can be delayed by > lower hypervisor, and due to time delay a different CPU is taking care of > missed timer start and runs the timer start logic on its own. Nice analysis. > This simple patch (just to give example of a fix) refactor this function to > get rid of unneeded lock which immediately was followed by the unlock (with > possible undesired base switch). > > (Both the bug and the fixed were found/patched by Leonid Shatz) The patch got mangled by your mail client and it is missing the proper Signed-off-by annotation in the patch description. See Documentation/SubmittingPatches. Can you please resend ? Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] OF: Fixup recursive locking code paths
On Fri, 25 Jan 2013, Paul Gortmaker wrote: > From: Thomas Gleixner > > There is no real reason to use a rwlock for devtree_lock. It even > could be a mutex, but unfortunately it's locked from cpu hotplug > paths which can't schedule :( > > So it needs to become a raw lock on rt as well. The devtree_lock would > be the only user of a raw_rw_lock, so we are better off cleaning up the > recursive locking paths which allows us to convert devtree_lock to a > read_lock. Hmm. It's already a rw_lock. For RT we want to change that thing to a raw_spinlock. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
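For reference, roughly what such a conversion looks like at one call site (a sketch; the lockless __of_find_property() helper and the exact locking shape are assumptions, not the quoted patch):

/* Sketch: devtree_lock as a raw spinlock, with the list walk done in a
 * lockless helper so recursive callers can take the lock once. */
static raw_spinlock_t devtree_lock = __RAW_SPIN_LOCK_UNLOCKED(devtree_lock);

struct property *of_find_property(const struct device_node *np,
				  const char *name, int *lenp)
{
	struct property *pp;
	unsigned long flags;

	raw_spin_lock_irqsave(&devtree_lock, flags);
	pp = __of_find_property(np, name, lenp);	/* assumed lockless helper */
	raw_spin_unlock_irqrestore(&devtree_lock, flags);

	return pp;
}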
[ANNOUNCE] 3.6.11-rt26
Dear RT Folks, I'm pleased to announce the 3.6.11-rt26 release. Changes since 3.6.11-rt25: 1) Fix the RT highmem implementation on x86 2) Support highmem + RT on ARM 3) Fix an one off error in the generic highmem code (upstream fix did not make it into 3.6.stable) 4) Upstream SLUB fixes (Christoph Lameter) 5) Fix a few RT issues in mmc and amba drivers 6) Initialize local locks in mm/swap.c early 7) Use simple wait queues for completions. This is a performance improvement. Completions do not have complex callbacks and the wakeup path is disabling interrupts anyway. So using simple wait locks with the raw spinlock is not a latency problem, but the "sleeping lock" in the normal waitqueue is a source for lock bouncing: T1 T2 lock(WQ) wakeup(T2) ---> preemption lock(WQ) pi_boost(T1) wait_for_lock(WQ) unlock(WQ) deboost(T1) ---> preemption The simple waitqueue reduces this to: T1 T2 raw_lock(WQ) wakeup(T2) raw_unlock(WQ) ---> preemption raw_lock(WQ) @Steven: Sorry, I forgot the stable tags on: drivers-tty-pl011-irq-disable-madness.patch mmci-remove-bogus-irq-save.patch idle-state.patch might-sleep-check-for-idle.patch mm-swap-fix-initialization.patch I'm still digging through my mail backlog, so I have not yet decided whether this is the last RT release for 3.6. The delta patch against 3.6.11-rt25 is appended below and can be found here: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/incr/patch-3.6.11-rt25-rt26.patch.xz The RT patch against 3.6.11 can be found here: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.11-rt26.patch.xz The split quilt queue is available at: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.11-rt26.tar.xz Enjoy, tglx -> Index: linux-stable/arch/arm/Kconfig === --- linux-stable.orig/arch/arm/Kconfig +++ linux-stable/arch/arm/Kconfig @@ -1749,7 +1749,7 @@ config HAVE_ARCH_PFN_VALID config HIGHMEM bool "High Memory Support" - depends on MMU && !PREEMPT_RT_FULL + depends on MMU help The address space of ARM processors is only 4 Gigabytes large and it has to accommodate user address space, kernel address Index: linux-stable/arch/x86/mm/highmem_32.c === --- linux-stable.orig/arch/x86/mm/highmem_32.c +++ linux-stable/arch/x86/mm/highmem_32.c @@ -21,6 +21,7 @@ void kunmap(struct page *page) } EXPORT_SYMBOL(kunmap); +#ifndef CONFIG_PREEMPT_RT_FULL /* * kmap_atomic/kunmap_atomic is significantly faster than kmap/kunmap because * no global lock is needed and because the kmap code must perform a global TLB @@ -115,6 +116,7 @@ struct page *kmap_atomic_to_page(void *p return pte_page(*pte); } EXPORT_SYMBOL(kmap_atomic_to_page); +#endif void __init set_highmem_pages_init(void) { Index: linux-stable/include/linux/wait-simple.h === --- linux-stable.orig/include/linux/wait-simple.h +++ linux-stable/include/linux/wait-simple.h @@ -22,12 +22,14 @@ struct swait_head { struct list_headlist; }; -#define DEFINE_SWAIT_HEAD(name)\ - struct swait_head name = { \ +#define SWAIT_HEAD_INITIALIZER(name) { \ .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ .list = LIST_HEAD_INIT((name).list), \ } +#define DEFINE_SWAIT_HEAD(name)\ + struct swait_head name = SWAIT_HEAD_INITIALIZER(name) + extern void __init_swait_head(struct swait_head *h, struct lock_class_key *key); #define init_swait_head(swh) \ @@ -40,59 +42,25 @@ extern void __init_swait_head(struct swa /* * Waiter functions */ -static inline bool swaiter_enqueued(struct swaiter *w) -{ - return w->task != NULL; -} - +extern void swait_prepare_locked(struct swait_head *head, struct swaiter *w); extern void 
swait_prepare(struct swait_head *head, struct swaiter *w, int state); +extern void swait_finish_locked(struct swait_head *head, struct swaiter *w); extern void swait_finish(struct swait_head *head, struct swaiter *w); /* - * Adds w to head->list. Must be called with head->lock locked. - */ -static inline void __swait_enqueue(struct swait_head *head, struct swaiter *w) -{ - list_add(&w->node, &head->list); -} - -/* - * Removes w from
Re: [ANNOUNCE] 3.6.11-rt26
On Mon, 4 Feb 2013, Thomas Gleixner wrote: > Dear RT Folks, > > I'm pleased to announce the 3.6.11-rt26 release. > > Changes since 3.6.11-rt25: Forgot to mention the change from EXPORT_SYMBOL_GPL to EXPORT_SYMBOL for pagefault_dis/enable. I really hate it, but it breaks the compilation of ^!@%^$@ drivers which work fine against mainline. Sigh! @Steven: Wants to go into stable-rt as well Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] 3.6.11-rt26
On Mon, 4 Feb 2013, Clark Williams wrote: > More changes; I was running into a collision with the name kmap_prot. Bah. I knew that I should have decided that today is still part of the weekend. Pushed out rt27 with the fix merged back. Sorry for the noise. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] 3.6.11-rt26
On Tue, 5 Feb 2013, Qiang Huang wrote: > On 2013/2/4 22:58, Thomas Gleixner wrote: > >From patches-3.6.11-rt28.patch.gz, your patch x86-highmem-make-it-work.patch > did this work. And you said > "It had been enabled quite some time, but never really worked." > > But I think there is a previous patch mm-rt-kmap-atomic-scheduling.patch did > the job, so I think RT highmem on x86 should have worked. > > Now with your patch, if we use kmap instead of kmap_atomic on RT, do we need > to revert Peter's patch as well? I should have done that, yes. > I haven't tested it, but if Peter's patch did solved the problem, is his way > better than use kmap? Because we can use more highmem virtual address, > although with some switch latency in some small probability scenarios. In theory it's better. Though I ran into some issues with that approach. It's on my todo list to revisit that problem, but for now the kmap way is at least safer. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: hrtimer possible issue
On Mon, 4 Feb 2013, Leonid Shatz wrote: > I assume the race can also happen between hrtimer cancel and start. In both > cases timer base switch can happen. > > Izik, please check if you can arrange the patch in the standard format (do > we need to do it against latest kernel version?) Yes please. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fix hrtimer_enqueue_reprogram race
On Mon, 4 Feb 2013, Izik Eidus wrote: > From: leonid Shatz > > it seems like hrtimer_enqueue_reprogram contain a race which could result in > timer.base switch during unlock/lock sequence. > > See the code at __hrtimer_start_range_ns where it calls > hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer > base for a short time and timer base switch can occur from a different CPU > thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base > switch could have happened and this causes the bug > > Try to start the same hrtimer from two different threads in kernel running > each one on a different CPU. Eventually one of the calls will cause timer base > switch while another thread is not expecting it. Aside of the bug in the hrtimer code being a real one, writing code which fiddles with the same resource (hrtimer) unserialized is broken on its own. > This can happen in virtualized environment where one thread can be delayed by > lower hypervisor, and due to time delay a different CPU is taking care of > missed timer start and runs the timer start logic on its own. Without noticing that something else already takes care of it? So you're saying that the code in question relies on magic serialization in the hrtimer code. Doesn't look like a brilliant design. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: clock_nanosleep() task_struct leak
On Tue, 5 Feb 2013, Stanislaw Gruszka wrote: > On Mon, Feb 04, 2013 at 08:32:23PM +0100, Oleg Nesterov wrote: > > On 02/01, Thomas Gleixner wrote: > > > > > > On Fri, 1 Feb 2013, Tommi Rantala wrote: > > > > > > > Hello, > > > > > > > > Trinity discovered a task_struct leak with clock_nanosleep(), > > > > reproducible with: > > > > > > > > -8<-8<-8<- > > > > #include <time.h> > > > > > > > > static const struct timespec req; > > > > > > > > int main(void) { > > > > return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID, > > > > TIMER_ABSTIME, &req, NULL); > > > > } > > > > -8<-8<-8<- > > > > posix_cpu_timer_create()->get_task_struct() I guess... > > > > Cough. I am not sure I ever understood this code, but now it certainly > > looks as if I never saw it before. > > Looks on do_cpu_nanosleep() we call posix_cpu_timer_create(), but we do > not call posix_cpu_timer_del() at the end. Fix will not be super simple, > since we need to care about error cases. I can cook a patch if nobody > else want to do this. Would be much appreciated! Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] fix hrtimer_enqueue_reprogram race
Leonid, On Tue, 5 Feb 2013, Leonid Shatz wrote: Please stop top posting! > The explanation were submitted as possible scenario which could explain how > the bug in kernel could happen and it does not mean that serious designer > could do exactly that. As I said before, it's also possible that a race > between hrtimer_cancel and hrtimer_start can trigger the bug. The idea is to > have kernel more robust. I'm not against making the kernel more robust and I already applied the patch. > There are already locks used inside hrtimer code, so why should > users of the hrtimer add another layer of locks and get involved in > the intricacy of which cases are protected by internal hrtimer lock > and which are not? Groan. The hrtimer locks are there to protect the internal data structures of the hrtimer code and to ensure that hrtimer functions are proper protected against concurrent running callbacks. But that does not give you any kind of protection versus multiple users of your hrtimer resource. Look at the following scenario: CPU0CPU1 hrtimer_cancel() hrtimer_start() teardown_crap() hrtimer_callback() runs That's probably not what you want and magic serialization in the hrtimer code does not help at all. There is also no protection against: CPU0CPU1 hrtimer_cancel() hrtimer_start() hrtimer_forward() Which leaves the hrtimer enqueued on CPU1 with a wrong expiry value. So while concurrent hrtimer_start() is protected, other things are not. So do we need to create a list of functions which can be abused by a programmer without proper protection of the resource and which not? If you want to use any kind of resource (including hrtimers) concurrently you better have proper serialization in that code. Everything else is voodoo programming of the worst kind. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
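A minimal sketch of the caller-side serialization being asked for; all names here are made up for illustration and this is not code from the thread:

/* The hrtimer is treated as a shared resource: every start goes through a
 * lock owned by the user, so two CPUs cannot race hrtimer_start() against
 * hrtimer_cancel()/hrtimer_forward() on the same timer. */
struct my_timer {
	struct hrtimer	timer;
	spinlock_t	lock;		/* serializes all users of .timer */
	bool		shutdown;	/* set once the timer is being torn down */
};

static void my_timer_start(struct my_timer *mt, ktime_t delta)
{
	unsigned long flags;

	spin_lock_irqsave(&mt->lock, flags);
	if (!mt->shutdown)
		hrtimer_start(&mt->timer, delta, HRTIMER_MODE_REL);
	spin_unlock_irqrestore(&mt->lock, flags);
}

static void my_timer_teardown(struct my_timer *mt)
{
	unsigned long flags;

	spin_lock_irqsave(&mt->lock, flags);
	mt->shutdown = true;		/* no new starts after this point */
	spin_unlock_irqrestore(&mt->lock, flags);

	/* Called without mt->lock held, so a running callback that takes
	 * mt->lock cannot deadlock while hrtimer_cancel() waits for it. */
	hrtimer_cancel(&mt->timer);
}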
Re: [PATCH 06/14] ARM: pci: Keep pci_common_init() around after init
Dear Thierry Reding, On Wed, 9 Jan 2013 21:43:06 +0100, Thierry Reding wrote: > When using deferred driver probing, PCI host controller drivers may > actually require this function after the init stage. > > Signed-off-by: Thierry Reding Tested-by: Thomas Petazzoni -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/32] lib: devres: don't enclose pcim_*() functions in CONFIG_HAS_IOPORT
The pcim_*() functions are used by the libata-sff subsystem, and this subsystem is used for many SATA drivers on ARM platforms that do not necessarily have I/O ports. Signed-off-by: Thomas Petazzoni Cc: Paul Gortmaker Cc: Jesse Barnes Cc: Yinghai Lu Cc: linux-kernel@vger.kernel.org --- lib/devres.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/devres.c b/lib/devres.c index 80b9c76..5639c3e 100644 --- a/lib/devres.c +++ b/lib/devres.c @@ -195,6 +195,7 @@ void devm_ioport_unmap(struct device *dev, void __iomem *addr) devm_ioport_map_match, (void *)addr)); } EXPORT_SYMBOL(devm_ioport_unmap); +#endif /* CONFIG_HAS_IOPORT */ #ifdef CONFIG_PCI /* @@ -400,4 +401,3 @@ void pcim_iounmap_regions(struct pci_dev *pdev, int mask) } EXPORT_SYMBOL(pcim_iounmap_regions); #endif /* CONFIG_PCI */ -#endif /* CONFIG_HAS_IOPORT */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 05/32] lib: devres: don't enclose pcim_*() functions in CONFIG_HAS_IOPORT
Dear Arnd Bergmann, On Tue, 12 Feb 2013 18:00:48 +, Arnd Bergmann wrote: > On Tuesday 12 February 2013, Thomas Petazzoni wrote: > > The pcim_*() functions are used by the libata-sff subsystem, and > > this subsystem is used for many SATA drivers on ARM platforms that > > do not necessarily have I/O ports. > > > > Signed-off-by: Thomas Petazzoni > > Cc: Paul Gortmaker > > Cc: Jesse Barnes > > Cc: Yinghai Lu > > Cc: linux-kernel@vger.kernel.org > > Sorry, but this patch is still incorrect. I know, but the discussion was so huge on the first posting that it was basically impossible to draw a conclusion out of it. > Any driver that requires a > linear mapping of I/O ports to __iomem pointers must depend > CONFIG_HAS_IOPORT with the current definition of that symbol (as > mentioned before, we should really rename that to > CONFIG_HAS_IOPORT_MAP). Having these functions not defined is a > compile time check that is necessary to ensure that all drivers have > the correct annotation. I have the feeling that the problem is more complex than that. My understanding is that the pcim_iomap_regions() function used by drivers/ata/libata-sff.c can perfectly be used to map memory BARs, and not necessarily I/O BARs. Therefore, this driver can perfectly be used in an architecture where CONFIG_NO_IOPORT is selected. The thing is that pcim_iomap_regions() transparently allows to remap an I/O BAR is such a BAR is passed as argument, or a memory BAR if such a BAR is passed as argument. Therefore, I continue to believe that the pcim_*() functions are useful even if the platform doesn't have CONFIG_HAS_IOPORT. Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
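For what it is worth, a sketch of the libata-style usage being argued for, where pcim_iomap_regions() only ever touches a memory BAR (the driver name and BAR number are made up):

#include <linux/pci.h>

#define MYDEV_BAR	0	/* assumed MMIO BAR, not an I/O BAR */

/* Hypothetical probe: the pcim_*() helpers manage the mapping, so they are
 * useful on platforms without I/O ports as long as only memory BARs are
 * passed in the mask. */
static int mydev_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	void __iomem *mmio;
	int rc;

	rc = pcim_enable_device(pdev);
	if (rc)
		return rc;

	rc = pcim_iomap_regions(pdev, 1 << MYDEV_BAR, "mydev");
	if (rc)
		return rc;

	mmio = pcim_iomap_table(pdev)[MYDEV_BAR];
	/* ... program the device through mmio ... */

	return 0;
}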
Re: [PATCH V2 2/2] cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
On Thursday, March 28, 2013 01:24:17 PM Jacob Shin wrote: > Future AMD processors, starting with Family 16h, can provide software > with feedback on how the workload may respond to frequency change -- > memory-bound workloads will not benefit from higher frequency, where > as compute-bound workloads will. This patch enables this "frequency > sensitivity feedback" to aid the ondemand governor to make better > frequency change decisions by hooking into the powersave bias. If I read this correctly, nothing changes even if the driver is loaded, unless user modifies: /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias is this correct? I wonder who should modify: /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias Even cpupower is not aware of this very specific tunable. Also, are you sure cpufreq subsystem will be the only user of this one? Or could cpuidle or others also make use of this somewhen in the future? Then this could more be done like: drivers/cpufreq/mperf.c And scheduler, cpuidle, cpufreq or whatever could use this as well. Just some thinking: I wonder how one could check/verify that the right thing is done (by CPU and kernel). Ideally it would be nice to have the CPU register appended to a cpufreq or cpuidle event trace. But this very (AMD or X86 only?) specific data would not look nice there. An arch placeholder value would be needed or similar? ... > +} > + > +static int __init amd_freq_sensitivity_init(void) > +{ > + int i; > + u32 eax, edx, dummy; > + > + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) > + return -ENODEV; > + > + cpuid(0x8007, &eax, &dummy, &dummy, &edx); If this really should be a separate module: Does/will Intel have the same (feature/cpuid bit)? Anyway, this should get a general AMD or X86 CPU capability flag. Then you can also autoload this driver similar to how it's done in acpi- cpufreq: static const struct x86_cpu_id acpi_cpufreq_ids[] = { X86_FEATURE_MATCH(X86_FEATURE_ACPI), X86_FEATURE_MATCH(X86_FEATURE_HW_PSTATE), {} }; MODULE_DEVICE_TABLE(x86cpu, acpi_cpufreq_ids); Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V3 2/2] cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
On Tuesday, April 02, 2013 03:03:37 PM Jacob Shin wrote: > On Tue, Apr 02, 2013 at 09:23:52PM +0200, Borislav Petkov wrote: > > On Tue, Apr 02, 2013 at 01:11:44PM -0500, Jacob Shin wrote: > > > Future AMD processors, starting with Family 16h, can provide software > > > with feedback on how the workload may respond to frequency change -- > > > memory-bound workloads will not benefit from higher frequency, where > > > as compute-bound workloads will. This patch enables this "frequency > > > sensitivity feedback" to aid the ondemand governor to make better > > > frequency change decisions by hooking into the powersave bias. I had a quick look at the specification of these registers. So this seem to be designed and stay very cpufreq specific and other kernel parts probably won't make use of it. ... > > > + > > > + /* this workload is not CPU bound, so choose a lower freq */ > > > + if (sensitivity < od_tuners->powersave_bias) { > > > > Ok, I still didn't get an answer to that: don't we want to use this > > feature by default, even without looking at ->powersave_bias? I mean, > > with feedback from the hardware, we kinda know better than the user, no? > > Well, so this powersave_bias also works as a tunable knob. > > From ondemand side, if /sys/../ondemand/powersave_bias is 0, then we > (AMD sensitivity) don't get called and you get the default ondemand > behavior. > > Like existing powersave_bias, users can tune the value to whatever > they want, to get a specturum of less to more aggressive power savings > vs performance. I understand powersave_bias code to only be able to do a more aggressive power saving way: If you pass 900, a frequency of 90% (for example 900MHz instead of 1000MHz) of the one ondemand typically would choose is taken. powersave_bias values above 1000 (take higher frequencies than the ondemand would take) are not allowed. powersave_bias is undocumented in Documentation/cpu-freq/... I guess its use-case is for people who want to get some percent more power savings out of their laptop and do not care of the one or other percent performance. In fact I would like to get rid of this extra code and I expect nobody would miss it. I might miss a configuration tool where someone went through the code, documented things and allows users to set powersave_bias values through some /etc/* config files. If so, please point me to it. What your patch misses are some hints how and when to use this at all. What value should a user write to powersave_bias tunable to activate your stuff? I guess it's also for laptop users to get some percent more battery out of their platform and this with an even higher performance rate? Server guys do not care for some percent of power, but they do care for some percent of performance. > I thought tunable would be more flexible .. out in the field or what > not .. no? Yep, if you want anyone to make use of this, it should better get embedded in more general, at least general ondemand code. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] timer: Fix possible issues with non serialized timer_pending( )
Vineet, On Fri, 29 Mar 2013, Vineet Gupta wrote: > When stress testing ARC Linux from 3.9-rc3, we've hit a serialization > issue when mod_timer() races with itself. This is on a FPGA board and > kernel .config among others has !SMP and !PREEMPT_COUNT. > > The issue happens in mod_timer( ) because timer_pending( ) based early > exit check is NOT done inside the timer base spinlock - as a networking > optimization. > > The value used in there, timer->entry.next is also used further in call > chain (all inlines though) for actual list manipulation. However if the > register containing this pointer remains live across the spinlock (in a > UP setup with !PREEMPT_COUNT there's nothing forcing gcc to reload) then > a stale value of next pointer causes incorrect list manipulation, > observed with following sequence in our tests. > > (0). tv1[x] <> t1 <---> t2 > (1). mod_timer(t1) interrupted after it calls timer_pending() > (2). mod_timer(t2) completes > (3). mod_timer(t1) resumes but messes up the list. > (4). __runt_timers( ) uses bogus timer_list entry / crashes in > timer->function > > The simplest fix is to NOT rely on spinlock based compiler barrier but > add an explicit one in timer_pending() That's simple, but dangerous. There is other code which relies on the implicit barriers of spinlocks, so I think we need to add the barrier to the !PREEMPT_COUNT implementation of preempt_*() macros. Thanks, tglx > FWIW, the relevant ARCompact disassembly of mod_timer which clearly > shows the issue due to register reuse is: > > mod_timer: > push_s blink > mov_s r13,r0 # timer, timer > > ... > ## timer_pending( ) > ld_s r3,[r13] # <-- .entry.next LOADED > brne r3, 0, @.L163 > > .L163: > > ## spin_lock_irq( ) > lr r5, [status32] # flags > bic r4, r5, 6 # temp, flags, > and.f 0, r5, 6 # flags, > flag.nz r4 > > ## detach_if_pending( ) begins > > tst_s r3,r3 <-- > # timer_pending( ) checks timer->entry.next > # r3 is NOT reloaded by gcc, using stale value > beq.d @.L169 > mov.eq r0,0 > > # detach_timer( ): __list_del( ) > > ld r4,[r13,4] # .entry.prev, D.31439 > st r4,[r3,4] # .prev, D.31439 > st r3,[r4]# .next, D.30246 > > Signed-off-by: Vineet Gupta > Reported-by: Christian Ruppert > Cc: Thomas Gleixner > Cc: Christian Ruppert > Cc: Pierrick Hascoet > Cc: linux-kernel@vger.kernel.org > --- > include/linux/timer.h | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/include/linux/timer.h b/include/linux/timer.h > index 8c5a197..1537104 100644 > --- a/include/linux/timer.h > +++ b/include/linux/timer.h > @@ -168,7 +168,16 @@ static inline void init_timer_on_stack_key(struct > timer_list *timer, > */ > static inline int timer_pending(const struct timer_list * timer) > { > - return timer->entry.next != NULL; > + int pending = timer->entry.next != NULL; > + > + /* > + * The check above enables timer fast path - early exit. > + * However most of the call sites are not protected by timer->base > + * spinlock. If the caller (say mod_timer) races with itself, it > + * can use the stale "next" pointer. See commit log for details. > + */ > + barrier(); > + return pending; > } > > extern void add_timer_on(struct timer_list *timer, int cpu); > -- > 1.7.10.4 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
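A sketch of the alternative Thomas suggests, i.e. putting the compiler barrier into the !PREEMPT_COUNT variants of the preempt macros instead of into timer_pending() (the exact shape is assumed; this is not the posted fix):

/* include/linux/preempt.h, !CONFIG_PREEMPT_COUNT case -- sketch only.
 * The barrier() keeps the compiler from caching values such as
 * timer->entry.next across what is a real critical section on
 * SMP/PREEMPT_COUNT builds. */
#define preempt_disable()		barrier()
#define preempt_enable_no_resched()	barrier()
#define preempt_enable()		barrier()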
Re: [PATCH V4 0/2] cpufreq: ondemand: add AMD specific powersave bias
On Thursday, April 04, 2013 11:19:02 AM Jacob Shin wrote: > This patchset adds AMD specific powersave bias function to the ondemand > governor; which can be used to help ondemand governor make more power > conscious frequency change decisions based on feedback from hardware > (availble on AMD Family 16h and above). Either the one way: 1) Documenting powersave_bias and add the stuff there, best with a default set so that the stuff gets used or 2) Marking powersave_bias deprecated and embed things into ondemand directly should be fine. As you give this some usefulness now and it's going to get used (automatically) and the stuff is even documented, I cannot suggest anything anymore how to integrate that better. Acked-by: Thomas Renninger -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 04:46:04 PM Yinghai Lu wrote: > One commit that tried to parse SRAT early get reverted before v3.9-rc1. > > | commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f > | Author: Tang Chen > | Date: Fri Feb 22 16:33:44 2013 -0800 > | > |acpi, memory-hotplug: parse SRAT before memblock is ready > > It broke several things, like acpi override and fall back path etc. > > This patchset is clean implementation that will parse numa info early. I tried acpi table overriding, but it did not work for me. In your tree there seem to miss acpi initrd overriding doku: Documentation/acpi/initrd_table_override.txt ? And your tree is 3.6.0-rc6-default+ based, right? I tried it like this: mkdir -p kernel/firmware/acpi cp dsdt.aml kernel/firmware/acpi find kernel | cpio -H newc --create > /boot/instrumented_initrd cat /boot/initrd >>/boot/instrumented_initrd modified /boot/grub/menu.lst and pointed to /boot/instrumented_initrd -> no override messages in dmesg, no overriding happened at all. Did I overlook something? Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 08:09:46 PM Yinghai Lu wrote: > On Thu, Apr 4, 2013 at 7:28 PM, Thomas Renninger wrote: > > On Thursday, April 04, 2013 04:46:04 PM Yinghai Lu wrote: > >> One commit that tried to parse SRAT early get reverted before v3.9-rc1. > >> > >> | commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f > >> | Author: Tang Chen > >> | Date: Fri Feb 22 16:33:44 2013 -0800 > >> | > >> |acpi, memory-hotplug: parse SRAT before memblock is ready > >> > >> It broke several things, like acpi override and fall back path etc. > >> > >> This patchset is clean implementation that will parse numa info early. > > > > I tried acpi table overriding, but it did not work for me. > > In your tree there seem to miss acpi initrd overriding doku: > > Documentation/acpi/initrd_table_override.txt > > http://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/tree/D > ocumentation/acpi/initrd_table_override.txt?h=for-x86-mm > > And your tree is 3.6.0-rc6-default+ based, right? > > It is in for-x86-mm branch, should be 3.9-rc5 based. > > http://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/tree/D > ocumentation/acpi/initrd_table_override.txt?h=for-x86-mm > > can you try > > git checkout -b for-x86-mm origin/for-x86-mm Argh stupid, I simply put a git clone before: could be found at: git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm I doubt I will make it today, so I'll try to give it a test on Mo. Thanks, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 08:09:46 PM Yinghai Lu wrote: ... > can you try > > git checkout -b for-x86-mm origin/for-x86-mm That worked out much better :) I see these changes in e820 table, the first part is probably unrelated: BIOS-e820: [mem 0x-0x0009bbff] usable ... BIOS-e820: [mem 0x0010-0xba294fff] usable modified: [mem 0x-0x0fff] reserved modified: [mem 0x1000-0x0009bbff] usable modified: [mem 0x0010-0xba27bfff] usable ... modified: [mem 0xba27c000-0xba2947fc] ACPI data modified: [mem 0xba2947fd-0xba294fff] usable And the ACPI data section where the modified tables are placed seem to get correctly inserted at: 0xba27c000-0xba2947fc -> 0x187FC == 100,348 bytes DSDT and FACP (better known as FADT) I passed have a size of (see dmesg parts below): 0x18709 + 0xF4 bytes = 100,349 bytes. Ah wait the 0xba2947fc is inclusive, so it should exactly fit. I then see: DSDT ACPI table found in initrd [0x378f5208-0x3790d910] FACP ACPI table found in initrd [0x3790d9a0-0x3790da93] ACPI: RSDP 000f0410 00024 (v02 INTEL) ACPI: XSDT bdf24d98 0008C (v01 INTEL ROMLEY 06222004 INTL 20090903) ACPI: Override [FACP- ROMLEY], this is unsafe: tainting kernel Disabling lock debugging due to kernel taint ACPI: FACP bdf24a98 Physical table override, new table: ff4af709 ACPI: FACP ba294709 000F4 (v04 INTEL ROMLEY 06222004 INTL 20121220) ACPI BIOS Bug: Warning: Invalid length for FADT/Pm1aControlBlock: 32, using default 16 (20130117/tbfadt-649) ACPI: Override [DSDT- ROMLEY], this is unsafe: tainting kernel ACPI: DSDT bdf09018 Physical table override, new table: ff4af000 ACPI: DSDT ba27c000 18709 (v02 INTEL ROMLEY 0021 INTL 20121220) Later I see my debug string added to the DSDT when the PCI Routing Table (_PRT) is processed: [9.505419] [ACPI Debug] String [0x0A] "XX" And taking the FADT from /sys/firmware/acpi/tables/FACP: my: PM Profile : 04 [Enterprise Server] changed (as expected) to: PM Profile : 02 [Mobile] >From acpi overriding parts: Tested-by: Thomas Renninger I also went through the override related patches and from what I can judge (certainly not the early memory, flat 32 bit memory you call it? specific parts), they look fine. Nice work! Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sched_clock: Prevent 64bit inatomicity on 32bit systems
The sched_clock_remote() implementation has the following inatomicity problem on 32bit systems when accessing the remote scd->clock, which is a 64bit value. CPU0CPU1 sched_clock_local() sched_clock_remote(CPU0) ... remote_clock = scd[CPU0]->clock read_low32bit(scd[CPU0]->clock) cmpxchg64(scd->clock,...) read_high32bit(scd[CPU0]->clock) While the update of scd->clock is using an atomic64 mechanism, the readout on the remote cpu is not, which can cause completely bogus readouts. It is a quite rare problem, because it requires the update to hit the narrow race window between the low/high readout and the update must go across the 32bit boundary. The resulting misbehaviour is, that CPU1 will see the sched_clock on CPU1 ~4 seconds ahead of it's own and update CPU1s sched_clock value to this bogus timestamp. This stays that way due to the clamping implementation for about 4 seconds until the synchronization with CLOCK_MONOTONIC undoes the problem. The issue is hard to observe, because it might only result in a less accurate SCHED_OTHER timeslicing behaviour. To create observable damage on realtime scheduling classes, it is necessary that the bogus update of CPU1 sched_clock happens in the context of an realtime thread, which then gets charged 4 seconds of RT runtime, which results in the RT throttler mechanism to trigger and prevent scheduling of RT tasks for a little less than 4 seconds. So this is quite unlikely as well. The issue was quite hard to decode as the reproduction time is between 2 days and 3 weeks and intrusive tracing makes it less likely, but the following trace recorded with trace_clock=global, which uses sched_clock_local(), gave the final hint: -0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80 -0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ... irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0 -0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80 What happens is that CPU0 goes idle and invokes sched_clock_idle_sleep_event() which invokes sched_clock_local() and CPU1 runs a remote wakeup for CPU0 at the same time, which invokes sched_remote_clock(). The time jump gets propagated to CPU0 via sched_remote_clock() and stays stale on both cores for ~4 seconds. There are only two other possibilities, which could cause a stale sched clock: 1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic wrong value. 2) sched_clock() which reads the TSC returns a sporadic wrong value. #1 can be excluded because sched_clock would continue to increase for one jiffy and then go stale. #2 can be excluded because it would not make the clock jump forward. It would just result in a stale sched_clock for one jiffy. After quite some brain twisting and finding the same pattern on other traces, sched_clock_remote() remained the only place which could cause such a problem and as explained above it's indeed racy on 32bit systems. So while on 64bit systems the readout is atomic, we need to verify the remote readout on 32bit machines. We need to protect the local->clock readout in sched_clock_remote() on 32bit as well because an NMI could hit between the low and the high readout, call sched_clock_local() and modify local->clock. Thanks to Siegfried Wulsch for bearing with my debug requests and going through the tedious tasks of running a bunch of reproducer systems to generate the debug information which let me decode the issue. 
Reported-by: Siegfried Wulsch Signed-off-by: Thomas Gleixner Cc: sta...@vger.kernel.org --- Index: linux-stable/kernel/sched/clock.c === --- linux.orig/kernel/sched/clock.c +++ linux/kernel/sched/clock.c @@ -176,10 +176,36 @@ static u64 sched_clock_remote(struct sch u64 this_clock, remote_clock; u64 *ptr, old_val, val; +#if BITS_PER_LONG != 64 +again: + /* +* Careful here: The local and the remote clock values need to +* be read out atomic as we need to compare the values and +* then update either the local or the remote side. So the +* cmpxchg64 below only protects one readout. +* +* We must reread via sched_clock_local() in the retry case on +* 32bit as an NMI could use sched_clock_local() via the +* tracer and hit between the readout of +* the low32bit and the high 32bit portion. +*/ + this_clock = sched_clock_local(my_scd); + /* +* We must enforce atomic readout on 32bit, otherwise the +* update on the remote cpu can hit inbetween the readout of +* the low32bit and the high 32bit portion. +*/ + remote_clock = cmpxchg64(&scd->clock, 0, 0); +#else + /* +* On 64bit the read of [my]scd->clock is atomic versus the +* u
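The idiom the patch relies on, shown in isolation (the helper name is made up; this is a sketch, not part of the patch):

/* On 32-bit, cmpxchg64() with identical old and new values acts as an
 * atomic 64-bit read: if *val == 0 it stores 0 (a no-op), otherwise it
 * stores nothing, and either way the full 64-bit old value comes back in
 * one atomic operation, so a concurrent update cannot be seen half-done. */
static inline u64 read_u64_atomic(u64 *val)
{
	return cmpxchg64(val, 0, 0);
}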
Re: [PATCH 2/2] x86 e820: Introduce memmap=resetusablemap for kdump usage
On Wednesday, January 23, 2013 08:07:19 PM Yinghai Lu wrote: > On Tue, Jan 22, 2013 at 12:06 PM, Yinghai Lu wrote: > > On Tue, Jan 22, 2013 at 8:32 AM, H. Peter Anvin wrote: > >>> Again: Please explain what is bad with this solution. > >>> I cannot see a better and more robust way for kdump other than > >>> reserving the original reserved memory areas as declared by the BIOS. > >> > >> It is bad because it creates more complexity than is needed. > >> > >> The whole point is that what we want is simply to switch type 1 to type > >> X, with the sole exceptions being the areas explicitly reserved for the > >> kdump kernel. > > > > Do you prefer to "reserveram" way in attached patch? > > Hi, Thomas, > > Can you please check attached reserveram version on your setup? > > If it is ok, i will put it in for-x86-boot patchset and send it to > Peter for v3.9. But this (converting usable memory to reserved one before usable kdump memory is added) will let machines run into problems again for which the check: "mmconf area must be in reserved memory" got added? If, then memory which was usable before has to be converted to a special E820_KUMP (or whatever type) to make sure existing checks which look for "is reserved memory" still work the same way as in a productive kernel. Advantage of this would be that the info what originally was usable memory is preserved and can be used in future kdump related patches. So I guess the final patch should be: - Add a new e820 type: E820_KDUMP_RESERVED /* Originally usable memory where the crashed kernel kernel resided in */ - Use Yinghai's last posted patch, but instead of: + e820_update_range(0, ULLONG_MAX, E820_RAM, + E820_RESERVED); ... + e820_remove_range(start_at, mem_size, E820_RESERVED, 0); do: + e820_update_range(0, ULLONG_MAX, E820_RAM, + E820_KDUMP_RESERVED); ... + e820_remove_range(start_at, mem_size, E820_KDUMP_RESERVED, 0); - Come up with another memmap=kdump_reserve_ram memmap option name or however it should get named... If this proposal gets accepted, I can send a tested patch... Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu
On Thursday, January 24, 2013 09:23:14 AM Takao Indoh wrote: > (2013/01/23 9:47), Thomas Renninger wrote: > > On Monday, January 21, 2013 10:11:04 AM Takao Indoh wrote: > >> (2013/01/08 4:09), Thomas Renninger wrote: > > ... > > > >>> I tried the provided patches first on 2.6.32, then I verfied with > >>> 3.8-rc2 > >>> and in both cases the disk is not detected anymore in > >>> reset_devices (kexec'ed/kdump) case (but things work fine without these > >>> patches). > >> > >> So the problem that the disk is not detected was caused by exactmap > >> problem you guys are discussing? Or still not detected even if exactmap > >> problem is fixed? > > > > This problem is related to the 5 PCI resetting patches. > > Dumping worked with a 2.6.32 and a 3.8-rc2 kernel, adding the PCI > > resetting > > patches broke both. I first tried 2.6.32 and verified with 3.8-rc2 to make > > sure I didn't mess up the backport adjustings of the patches to 2.6.32. > If you have a chance please try again the patches with the latest > firmware. Not sure I can update the firmware as this is a reference platform used exactly like this in production. I also cannot see how this could help. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
the patch "perf tools: Update Makefile for Android" broke 3.8-rc perf build.
Linux Kernel Mailing List skrev 12.12.2012 05:13: Gitweb: http://git.kernel.org/linus/;a=commit;h=d816ec2d1bea55cfeac373f0ab0ab8a3105e49b4 Commit: d816ec2d1bea55cfeac373f0ab0ab8a3105e49b4 Parent: 78da39faf7c903bb6e3c20a726fde1bf98d10af8 Author: Irina Tirdea AuthorDate: Mon Oct 8 09:43:27 2012 +0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon Oct 8 17:42:16 2012 -0300 perf tools: Update Makefile for Android For cross-compiling on Android, some specific changes are needed in the Makefile. The above patch broke perf build on i586 and x86_64: [tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all CHK -fstack-protector-all CHK -Wstack-protector CHK -Wvolatile-register-var CHK bionic :1:31: fatal error: android/api-level.h: No such file or directory compilation terminated. This is a regression since 3.7 -- Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/74] perf python: Fix breakage introduced by the test_attr infrastructure
Arnaldo Carvalho de Melo skrev 24.1.2013 22:07: From: Arnaldo Carvalho de Melo The test_attr infrastructure hooks on the sys_perf_event_open call, checking if a variable is set and if so calling a function to intercept calls and do the checking. But both the variable and the function aren't on objects that are linked on the python binding, breaking it: Atleast this one is 3.8 material as it is a regression since 3.7 [tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all python_ext_build/tmp/util/evsel.o: In function `sys_perf_event_open': /mnt/work/Mageia/RPM/1Work/kerenels/linux-3.8-rc5/tools/perf/util/../perf.h:183: undefined reference to `test_attr__enabled' /mnt/work/Mageia/RPM/1Work/kerenels/linux-3.8-rc5/tools/perf/util/../perf.h:184: undefined reference to `test_attr__open' collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1 -- Thomas # perf test -v 15 15: Try 'use perf' in python, checking link problems : --- start --- Traceback (most recent call last): File "", line 1, in ImportError: /home/acme/git/build/perf//python/perf.so: undefined symbol: test_attr__enabled end Try 'use perf' in python, checking link problems: FAILED! # Fix it by moving the variable to one of the linked object files and providing a stub for the function in the python.o object, that is only linked in the python binding. Now 'perf test' is happy again: # perf test 15 15: Try 'use perf' in python, checking link problems : Ok # Cc: David Ahern Cc: Frederic Weisbecker Cc: Jiri Olsa Cc: Mike Galbraith Cc: Namhyung Kim Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lkml.kernel.org/n/tip-0rsca2kn44b38rgdpr3tz...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/attr.c | 2 -- tools/perf/util/python.c | 9 + tools/perf/util/util.c | 2 ++ 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c index 25638a9..05b5acb 100644 --- a/tools/perf/tests/attr.c +++ b/tools/perf/tests/attr.c @@ -33,8 +33,6 @@ extern int verbose; -bool test_attr__enabled; - static char *dir; void test_attr__init(void) diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index a2657fd..925e0c3 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -1045,3 +1045,12 @@ error: if (PyErr_Occurred()) PyErr_SetString(PyExc_ImportError, "perf: Init failed!"); } + +/* + * Dummy, to avoid dragging all the test_attr infrastructure in the python + * binding. + */ +void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu, + int fd, int group_fd, unsigned long flags) +{ +} diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c index 5906e84..252b889 100644 --- a/tools/perf/util/util.c +++ b/tools/perf/util/util.c @@ -12,6 +12,8 @@ */ unsigned int page_size; +bool test_attr__enabled; + bool perf_host = true; bool perf_guest = false; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
perf 3.8-rc build failure: undefined reference to `strlcpy'
[tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all ... /tmp/ccJEJv6m.o: In function `main': :(.text+0x14): undefined reference to `strlcpy' collect2: ld returned 1 exit status ... This did not show up in 3.7 -- Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
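For reference, strlcpy() is a BSD/kernel string helper that glibc does not provide on the host, which is why the link above fails. The sketch below only illustrates the conventional semantics (copy at most size - 1 bytes, always NUL-terminate, return the full source length); it is not the fix that went into perf.

#include <string.h>

/* Illustration only: conventional strlcpy() semantics. */
size_t strlcpy(char *dest, const char *src, size_t size)
{
	size_t ret = strlen(src);

	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;

		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}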
Re: [PATCH 2/2] x86 e820: Introduce memmap=resetusablemap for kdump usage
On Tuesday, January 29, 2013 03:10:38 AM Yinghai Lu wrote: > On Mon, Jan 28, 2013 at 5:11 PM, H. Peter Anvin wrote: > >> So I guess the final patch should be: > >>- Add a new e820 type: > >> E820_KDUMP_RESERVED /* Originally usable memory where the crashed > >> kernel kernel resided > >> in */ > >> - Use Yinghai's last posted patch, but instead of: > >> + e820_update_range(0, ULLONG_MAX, E820_RAM, > >> + E820_RESERVED); > >> ... > >> + e820_remove_range(start_at, mem_size, E820_RESERVED, > >> 0); > >> do: > >> + e820_update_range(0, ULLONG_MAX, E820_RAM, > >> + E820_KDUMP_RESERVED); > >> ... > >> + e820_remove_range(start_at, mem_size, > >> E820_KDUMP_RESERVED, 0); > >> > >> - Come up with another memmap=kdump_reserve_ram memmap option name > >> or however it should get named... > >> > >> If this proposal gets accepted, I can send a tested patch... > >> > > > > Yes, this is much saner. There really shouldn't need to be an option, > > even; since the tools need to be modified anyway, just modify the actual > > memory map data structure itself. > > yes, > > kexec-tools will change that to E820_KDUMP_RESERVED (or other good name). > > We only need to update kernel to get old max_pfn by > checking E820_KDUMP_RESERVED. Wait, above proposal does not include kexec-tools mangling of the e820 table, for several reasons: - Keep the boot interface clean and pass the original table - Only one possible error source on e820 table modifications - While hpa proposed kexec-tools to pass a modified e820 table to make things easier, exactly the opposite is the case: If kexec-tools and the kernel modify the table, things are more complex and hard to understand in case of debugging where things went wrong - It's really easy to do that in the kernel. As shown above it should simply be this line to change usable areas into E820_KDUMP_RESERVED ones: e820_update_range(0, ULLONG_MAX, E820_RAM, E820_KDUMP_RESERVED); and possibly slight adjusting when the memmap=X#Y memory the kdump kernel uses is added (has to override E820_KDUMP_RESERVED areas with usable memory again) My previously posted kexec-tools patches should simply work, it's just that the memmap option name changes to: memmap=kdump_reserve_ram This is what I proposed and is IMO the best and less complex way to go. I guess I still wait another day for comments and will send something if you agree. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
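To make the proposal above concrete, here is a minimal, untested sketch of what the kernel side could do when a hypothetical memmap=kdump_reserve_ram option has been parsed. It assumes the proposed E820_KDUMP_RESERVED type exists and only reuses the e820 helpers quoted in the thread.

/* Sketch only: start_at/mem_size are the memmap=X#Y region handed to
 * the kdump kernel. */
static void __init kdump_reserve_ram_setup(u64 start_at, u64 mem_size)
{
	/* Everything that was usable RAM in the crashed kernel... */
	e820_update_range(0, ULLONG_MAX, E820_RAM, E820_KDUMP_RESERVED);

	/* ...except the region the kdump kernel itself runs in, which
	 * has to become usable memory again. */
	e820_remove_range(start_at, mem_size, E820_KDUMP_RESERVED, 0);
	e820_add_region(start_at, mem_size, E820_RAM);
}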
Re: [PATCH 1/5] net: mvmdio: unmap base register address at driver removal
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:04 +0100, Florian Fainelli wrote: > Fix the driver remove callback to unmap the base register address and > not leak this mapping after the driver has been removed. > > Signed-off-by: Florian Fainelli What about using devm_request_and_ioremap() instead, in order to get automatic unmap on error and in the ->remove() path? But maybe it won't work because this memory range is claimed both by the MDIO driver and the Ethernet driver itself. In that case, you could use devm_ioremap(). Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
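As an illustration of the devm_ioremap() suggestion (a sketch, not the posted patch): once the mapping is managed by devres, ->remove() no longer needs an explicit iounmap(), and since devm_ioremap() does not request the region, the overlap with the Ethernet driver's registers stays harmless.

static int orion_mdio_remove(struct platform_device *pdev)
{
	struct mii_bus *bus = platform_get_drvdata(pdev);

	mdiobus_unregister(bus);
	kfree(bus->irq);
	mdiobus_free(bus);
	/* No iounmap() here: the mapping created with devm_ioremap()
	 * in ->probe() is released automatically by the devres core. */
	return 0;
}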
Re: [PATCH 2/5] net: mvmdio: rename base register cookie from smireg to regs
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:05 +0100, Florian Fainelli wrote: > This patch renames the base register cookie in the mvmdio drive from > "smireg" to "regs" since a subsequent patch is going to use an ioremap() > cookie whose size is larger than a single register of 4 bytes. No > functionnal code change introduced. > > Signed-off-by: Florian Fainelli Acked-by: Thomas Petazzoni -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/5] net: mvmdio: enhance driver to support SMI error/done interrupts
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:06 +0100, Florian Fainelli wrote: > #define MVMDIO_SMI_DATA_SHIFT 0 > #define MVMDIO_SMI_PHY_ADDR_SHIFT 16 > @@ -36,12 +40,28 @@ > #define MVMDIO_SMI_WRITE_OPERATION 0 > #define MVMDIO_SMI_READ_VALID BIT(27) > #define MVMDIO_SMI_BUSYBIT(28) > +#define MVMDIO_ERR_INT_CAUSE0x007C > +#define MVMDIO_ERR_INT_SMI_DONE0x0010 > +#define MVMDIO_ERR_INT_MASK 0x0080 > > struct orion_mdio_dev { > struct mutex lock; > void __iomem *regs; > + /* > + * If we have access to the error interrupt pin (which is > + * somewhat misnamed as it not only reflects internal errors > + * but also reflects SMI completion), use that to wait for > + * SMI access completion instead of polling the SMI busy bit. > + */ > + int err_interrupt; > + wait_queue_head_t smi_busy_wait; > }; > > +static int orion_mdio_smi_is_done(struct orion_mdio_dev *dev) > +{ > + return !(readl(dev->regs) & MVMDIO_SMI_BUSY); > +} > + > /* Wait for the SMI unit to be ready for another operation > */ > static int orion_mdio_wait_ready(struct mii_bus *bus) > @@ -50,19 +70,30 @@ static int orion_mdio_wait_ready(struct mii_bus *bus) > int count; > u32 val; > > - count = 0; > - while (1) { > - val = readl(dev->regs); > - if (!(val & MVMDIO_SMI_BUSY)) > - break; > - > - if (count > 100) { > - dev_err(bus->parent, "Timeout: SMI busy for too > long\n"); > - return -ETIMEDOUT; > + if (dev->err_interrupt == NO_IRQ) { > + count = 0; > + while (1) { > + val = readl(dev->regs); > + if (!(val & MVMDIO_SMI_BUSY)) > + break; What about using your new orion_mdio_smi_is_done() function here? > + > + if (count > 100) { > + dev_err(bus->parent, > + "Timeout: SMI busy for too long\n"); > + return -ETIMEDOUT; > + } > + > + udelay(10); > + count++; > } > + } > > - udelay(10); > - count++; > + if (!orion_mdio_smi_is_done(dev)) { Maybe it should be in an else if block so that the waitqueue case is only considered if there is an IRQ registered? Of course practically speaking, it's OK because if there is no IRQ, we'll wait in the polling loop above, and either exit from the function on timeout, or continue on success. But it still would make the code a little bit clearer, I'd say. > static int orion_mdio_probe(struct platform_device *pdev) > { > struct device_node *np = pdev->dev.of_node; > @@ -181,6 +227,19 @@ static int orion_mdio_probe(struct platform_device *pdev) > return -ENODEV; > } > > + dev->err_interrupt = NO_IRQ; Not needed, you already do dev->err_interrupt = something() below. > + init_waitqueue_head(&dev->smi_busy_wait); > + > + dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + if (dev->err_interrupt != NO_IRQ) { > + ret = devm_request_irq(&pdev->dev, dev->err_interrupt, > + orion_mdio_err_irq, > + IRQF_SHARED, pdev->name, dev); > + if (!ret) > + writel(MVMDIO_ERR_INT_SMI_DONE, > + dev->regs + MVMDIO_ERR_INT_MASK); > + } > + > mutex_init(&dev->lock); > > ret = of_mdiobus_register(bus, np); > @@ -202,6 +261,8 @@ static int orion_mdio_remove(struct platform_device *pdev) > struct mii_bus *bus = platform_get_drvdata(pdev); > struct orion_mdio_dev *dev = bus->priv; > > + writel(0, dev->regs + MVMDIO_ERR_INT_MASK); > + free_irq(dev->err_interrupt, dev); free_irq() not needed since the IRQ handler is registered with devm_request_irq(). > mdiobus_unregister(bus); > kfree(bus->irq); > mdiobus_free(bus); Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. 
http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
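Putting the two remarks above together, the wait routine could look roughly like this (a sketch against the quoted patch; the 100 ms sleep timeout is an arbitrary value): the polling branch reuses orion_mdio_smi_is_done(), and the waitqueue path only runs when an interrupt is available.

static int orion_mdio_wait_ready(struct mii_bus *bus)
{
	struct orion_mdio_dev *dev = bus->priv;
	int count = 0;

	if (dev->err_interrupt == NO_IRQ) {
		/* No interrupt available: poll the SMI busy bit. */
		while (!orion_mdio_smi_is_done(dev)) {
			if (count > 100) {
				dev_err(bus->parent,
					"Timeout: SMI busy for too long\n");
				return -ETIMEDOUT;
			}
			udelay(10);
			count++;
		}
	} else if (!orion_mdio_smi_is_done(dev)) {
		/* Interrupt available: sleep until the handler wakes us. */
		if (!wait_event_timeout(dev->smi_busy_wait,
					orion_mdio_smi_is_done(dev),
					msecs_to_jiffies(100)))
			return -ETIMEDOUT;
	}

	return 0;
}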
Re: [PATCH 4/5] net: mvmdio: allow Device Tree and platform device to coexist
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:07 +0100, Florian Fainelli wrote: > This patch changes the Marvell MDIO driver to be registered by using > both Device Tree and platform device methods. The driver voluntarily > does not use devm_ioremap() to share the same error path for Device Tree > and non-Device Tree cases. Not sure why you think devm_ioremap() can't be used here. Maybe I'm missing something, but could you explain? If you use devm_ioremap(), then basically you don't need to do anything in the error path regarding to the I/O mapping... since it's the whole purpose of the devm_*() stuff to automagically undo things in the error case, and in the ->remove() code. > - dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + if (pdev->dev.of_node) { > + dev->regs = of_iomap(pdev->dev.of_node, 0); > + if (!dev->regs) { > + dev_err(&pdev->dev, "No SMI register address given in > DT\n"); > + ret = -ENODEV; > + goto out_free; > + } > + > + dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + } else { > + r = platform_get_resource(pdev, IORESOURCE_MEM, 0); > + > + dev->regs = ioremap(r->start, resource_size(r)); > + if (!dev->regs) { > + dev_err(&pdev->dev, "No SMI register address given\n"); > + ret = -ENODEV; > + goto out_free; > + } > + > + dev->err_interrupt = platform_get_irq(pdev, 0); > + } I think you can do a devm_ioremap() and a platform_get_irq() in both cases here, and therefore keep the code common between the DT case and the !DT case. Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
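To illustrate the last point, a hypothetical helper (the name is made up) that serves both the DT and the non-DT case: when the device comes from DT, the OF core fills in the platform resources, so platform_get_resource() and platform_get_irq() work in both situations, and devm_ioremap() removes the unmap from the error path entirely.

/* Hypothetical helper, not part of the posted patch. */
static int orion_mdio_map(struct platform_device *pdev,
			  struct orion_mdio_dev *dev)
{
	struct resource *r;

	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	if (!r) {
		dev_err(&pdev->dev, "No SMI register address given\n");
		return -ENODEV;
	}

	dev->regs = devm_ioremap(&pdev->dev, r->start, resource_size(r));
	if (!dev->regs) {
		dev_err(&pdev->dev, "Unable to remap SMI registers\n");
		return -ENOMEM;
	}

	/* Covers both the DT and the board-file case. */
	dev->err_interrupt = platform_get_irq(pdev, 0);
	return 0;
}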
Re: [PATCH 5/5] mv643xx_eth: convert to use the Marvell Orion MDIO driver
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:08 +0100, Florian Fainelli wrote: > This patch converts the Marvell MV643XX ethernet driver to use the > Marvell Orion MDIO driver. As a result, PowerPC and ARM platforms > registering the Marvell MV643XX ethernet driver are also updated to > register a Marvell Orion MDIO driver. This driver voluntarily overlaps > with the Marvell Ethernet shared registers because it will use a subset > of this shared register (shared_base + 0x4 - shared_base + 0x84). The > Ethernet driver is also updated to look up for a PHY device using the > Orion MDIO bus driver. > > Signed-off-by: Florian Fainelli > --- > arch/arm/plat-orion/common.c | 84 +++-- In this file, there was one "MV643XX_ETH_SHARED_NAME" platform_device registered for each network interface. Why? If the driver is shared, isn't the whole idea to register it only once? In any case, one of the ideas behind separating the mvmdio driver from the mvneta driver in the first place is that there should be only one instance of the mvmdio device, even if there are multiple network interfaces. The reason is that from a HW point of view, the MDIO unit is shared between the network interfaces. If you look at armada-370-xp.dtsi, there is only one mvmdio device registered, and two network interfaces (using the mvneta driver) that are registered (and actually up to four network interfaces can exist, they are added by some other .dtsi files depending on the specific SoC). So I don't think there should be one instance of the mvmdio per network interface. Also, I am wondering what's left in this MV643XX_ETH_SHARED_NAME driver once the MDIO stuff has been pulled out into a separate driver? I think the whole point of this work should be to get rid of this MV643XX_ETH_SHARED_NAME driver, no? Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
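For the non-DT boards, the idea would be something along these lines (a sketch with placeholder addresses, and assuming the mvmdio driver binds to the "orion-mdio" platform device name): one device registered once by the platform code, regardless of how many interfaces a board enables.

/* Placeholder base address/size, not taken from a real board file. */
static struct resource orion_mdio_resources[] = {
	{
		.start	= 0xf1072004,
		.end	= 0xf1072004 + 0x84 - 1,
		.flags	= IORESOURCE_MEM,
	},
};

static struct platform_device orion_mdio_device = {
	.name		= "orion-mdio",
	.id		= -1,
	.num_resources	= ARRAY_SIZE(orion_mdio_resources),
	.resource	= orion_mdio_resources,
};

void __init orion_mdio_init(void)
{
	/* Exactly one shared MDIO instance for the whole SoC. */
	platform_device_register(&orion_mdio_device);
}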
Re: [PATCH 5/5] mv643xx_eth: convert to use the Marvell Orion MDIO driver
Dear Florian Fainelli, On Tue, 29 Jan 2013 17:27:56 +0100, Florian Fainelli wrote: > It looks like I introduced two redundant mvmdio instances as ge01 > refers to the ge00 smi bus (the same applies to ge11 and ge10). > Thanks for spotting this. Ok, good. > If you take a closer look at mv643xx_eth you will see that the > "shared" driver still handles the mconf bus window configuration, > which is not abstracted yet. Indeed, I've seen that. But I don't understand why it's done in the mv643xx_eth_shared_probe(). The mbus window configuration registers are per-network interface, so this call to mv643xx_eth_conf_mbus_windows() could presumably be done in mv643xx_eth_probe(). At least in mvneta, we have the same registers, and we do their initialization in the driver's normal (and only) ->probe() routine. > Besides that, I would rather do it step by step. Yes, agreed. But I think it would be good to have follow-up patches that progressively get rid of the shared driver thing, as it will help in bringing a proper DT binding to the mv643xx_eth driver. But it certainly doesn't need to be part of this specific patch. Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpuidle: fix new C-states not functional after AC disconnect
On Sunday, 13 January 2013, 21:44:41, Daniel Lezcano wrote: > On 01/13/2013 09:36 PM, Sedat Dilek wrote: > > 0001: Refreshed 1-2 as v3 against Linux v3.8-rc3. > > 0002: v2 of 2-2 applied cleanly after 1-2 was refreshed! > > Hi Sedat, > > for the moment, you should use only the 1/2 because 2/2 (which is an > optimization) is wrong. Hi Daniel, thanks again for this patch; together with my patch it finally fixes the bug. Now I have noticed that only my patch was sent to stable (thanks Rafael), but not yours. So the bug is not completely fixed in 3.4 and 3.7. Is there a reason for not sending this to stable, too? Kind regards, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel failing to boot when compressed with bzip2
On 30/01/2013 14:58, Rob Landley wrote: > > On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote: > >> So my guess is that there's something badly broken in the bzip2 kernel >> decompressing code.. ? There's both a regression between kernel 3.6 and >> 3.7, and a problem with gcc-4.7. > Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17 > (the last GPLv2 releases), and it's for i686 I guess that's the main point: my bug appears on AMD64, not i686, and I can trigger it with either gcc-4.6 or gcc-4.7. I'm pretty sure it still works with a gcc as old as 4.2.1. I use binutils 2.23.1, but I doubt it matters here. Again, on amd64 using vanilla kernel releases (3.x.y):
linux-3.6 with gcc-4.6 : ok
linux-3.6 with gcc-4.7 : fail
linux-3.7 with gcc-4.6 : fail
linux-3.7 with gcc-4.7 : fail
I've had this BUNZIP2 option for a very long time, so I'm sure it was working well with previous versions of both gcc and the Linux kernel. greetings, Thomas -- Thomas Capricelli http://www.freehackers.org/thomas/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 05/40] cpu: Restructure cpu_down code
Split out into separate functions, so we can convert it to a state machine. Signed-off-by: Thomas Gleixner --- kernel/cpu.c | 69 --- 1 file changed, 47 insertions(+), 22 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -168,6 +168,43 @@ static int cpu_notify(unsigned long val, return __cpu_notify(val, cpu, -1, NULL); } +/* Notifier wrappers for transitioning to state machine */ +static int notify_prepare(unsigned int cpu) +{ + int nr_calls = 0; + int ret; + + ret = __cpu_notify(CPU_UP_PREPARE, cpu, -1, &nr_calls); + if (ret) { + nr_calls--; + printk(KERN_WARNING "%s: attempt to bring up CPU %u failed\n", + __func__, cpu); + __cpu_notify(CPU_UP_CANCELED, cpu, nr_calls, NULL); + } + return ret; +} + +static int notify_online(unsigned int cpu) +{ + cpu_notify(CPU_ONLINE, cpu); + return 0; +} + +static int bringup_cpu(unsigned int cpu) +{ + struct task_struct *idle = idle_thread_get(cpu); + int ret; + + /* Arch-specific enabling code. */ + ret = __cpu_up(cpu, idle); + if (ret) { + cpu_notify(CPU_UP_CANCELED, cpu); + return ret; + } + BUG_ON(!cpu_online(cpu)); + return 0; +} + #ifdef CONFIG_HOTPLUG_CPU static void cpu_notify_nofail(unsigned long val, unsigned int cpu) @@ -340,7 +377,7 @@ EXPORT_SYMBOL(cpu_down); static int __cpuinit _cpu_up(unsigned int cpu, int tasks_frozen) { struct task_struct *idle; - int ret, nr_calls = 0; + int ret; cpu_hotplug_begin(); @@ -355,35 +392,23 @@ static int __cpuinit _cpu_up(unsigned in goto out; } + cpuhp_tasks_frozen = tasks_frozen; + ret = smpboot_create_threads(cpu); if (ret) goto out; - cpuhp_tasks_frozen = tasks_frozen; - - ret = __cpu_notify(CPU_UP_PREPARE, cpu, -1, &nr_calls); - if (ret) { - nr_calls--; - printk(KERN_WARNING "%s: attempt to bring up CPU %u failed\n", - __func__, cpu); - goto out_notify; - } + ret = notify_prepare(cpu); + if (ret) + goto out; - /* Arch-specific enabling code. */ - ret = __cpu_up(cpu, idle); - if (ret != 0) - goto out_notify; - BUG_ON(!cpu_online(cpu)); + ret = bringup_cpu(cpu); + if (ret) + goto out; /* Wake the per cpu threads */ smpboot_unpark_threads(cpu); - - /* Now call notifier in preparation. */ - cpu_notify(CPU_ONLINE, cpu); - -out_notify: - if (ret != 0) - __cpu_notify(CPU_UP_CANCELED, cpu, nr_calls, NULL); + notify_online(cpu); out: cpu_hotplug_done(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 15/40] x86: perf: Convert AMD IBS to hotplug state machine
Install the callbacks via the state machine and let the core invoke the callbacks on the already online cpus. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_amd_ibs.c | 54 +++ include/linux/cpuhotplug.h |1 2 files changed, 21 insertions(+), 34 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_amd_ibs.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_amd_ibs.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_amd_ibs.c @@ -637,13 +637,10 @@ static __init int perf_ibs_pmu_init(stru return ret; } -static __init int perf_event_ibs_init(void) +static __init void perf_event_ibs_init(void) { struct attribute **attr = ibs_op_format_attrs; - if (!ibs_caps) - return -ENODEV; /* ibs not supported by the cpu */ - perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch"); if (ibs_caps & IBS_CAPS_OPCNT) { @@ -654,13 +651,11 @@ static __init int perf_event_ibs_init(vo register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs"); printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps); - - return 0; } #else /* defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) */ -static __init int perf_event_ibs_init(void) { return 0; } +static __init void perf_event_ibs_init(void) { } #endif @@ -827,11 +822,10 @@ static inline int get_ibs_lvt_offset(voi return val & IBSCTL_LVT_OFFSET_MASK; } -static void setup_APIC_ibs(void *dummy) +static void setup_APIC_ibs(void) { - int offset; + int offset = get_ibs_lvt_offset(); - offset = get_ibs_lvt_offset(); if (offset < 0) goto failed; @@ -842,30 +836,19 @@ failed: smp_processor_id()); } -static void clear_APIC_ibs(void *dummy) +static int __cpuinit x86_pmu_amd_ibs_starting_cpu(unsigned int cpu) { - int offset; - - offset = get_ibs_lvt_offset(); - if (offset >= 0) - setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1); + setup_APIC_ibs(); + return 0; } -static int __cpuinit -perf_ibs_cpu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) +static int __cpuinit x86_pmu_amd_ibs_dying_cpu(unsigned int cpu) { - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_STARTING: - setup_APIC_ibs(NULL); - break; - case CPU_DYING: - clear_APIC_ibs(NULL); - break; - default: - break; - } + int offset = get_ibs_lvt_offset(); - return NOTIFY_OK; + if (offset >= 0) + setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1); + return 0; } static __init int amd_ibs_init(void) @@ -889,15 +872,18 @@ static __init int amd_ibs_init(void) if (!ibs_eilvt_valid()) goto out; - get_online_cpus(); ibs_caps = caps; /* make ibs_caps visible to other cpus: */ smp_mb(); - perf_cpu_notifier(perf_ibs_cpu_notifier); - smp_call_function(setup_APIC_ibs, NULL, 1); - put_online_cpus(); + /* +* x86_pmu_amd_ibs_starting_cpu will be called from core on +* all online cpus. 
+*/ + cpuhp_setup_state(CPUHP_AP_PERF_X86_AMD_IBS_STARTING, + x86_pmu_amd_ibs_starting_cpu, + x86_pmu_amd_ibs_dying_cpu); - ret = perf_event_ibs_init(); + perf_event_ibs_init(); out: if (ret) pr_err("Failed to setup IBS, %d\n", ret); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_AP_OFFLINE, CPUHP_AP_SCHED_STARTING, CPUHP_AP_PERF_X86_UNCORE_STARTING, + CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 39/40] relayfs: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |7 + kernel/cpu.c |4 +++ kernel/relay.c | 59 ++--- 3 files changed, 25 insertions(+), 45 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -18,6 +18,7 @@ enum cpuhp_states { CPUHP_PROFILE_PREPARE, CPUHP_X2APIC_PREPARE, CPUHP_SMPCFD_PREPARE, + CPUHP_RELAY_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -204,4 +205,10 @@ int profile_online_cpu(unsigned int cpu) int smpcfd_prepare_cpu(unsigned int cpu); int smpcfd_dead_cpu(unsigned int cpu); +#ifdef CONFIG_RELAY +int relay_prepare_cpu(unsigned int cpu); +#else +#define relay_prepare_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -768,6 +768,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = smpcfd_prepare_cpu, .teardown = smpcfd_dead_cpu, }, + [CPUHP_RELAY_PREPARE] = { + .startup = relay_prepare_cpu, + .teardown = NULL, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/relay.c === --- linux-2.6.orig/kernel/relay.c +++ linux-2.6/kernel/relay.c @@ -508,46 +508,24 @@ static void setup_callbacks(struct rchan chan->cb = cb; } -/** - * relay_hotcpu_callback - CPU hotplug callback - * @nb: notifier block - * @action: hotplug action to take - * @hcpu: CPU number - * - * Returns the success/failure of the operation. (%NOTIFY_OK, %NOTIFY_BAD) - */ -static int __cpuinit relay_hotcpu_callback(struct notifier_block *nb, - unsigned long action, - void *hcpu) +int __cpuinit relay_prepare_cpu(unsigned int cpu) { - unsigned int hotcpu = (unsigned long)hcpu; struct rchan *chan; - switch(action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - mutex_lock(&relay_channels_mutex); - list_for_each_entry(chan, &relay_channels, list) { - if (chan->buf[hotcpu]) - continue; - chan->buf[hotcpu] = relay_open_buf(chan, hotcpu); - if(!chan->buf[hotcpu]) { - printk(KERN_ERR - "relay_hotcpu_callback: cpu %d buffer " - "creation failed\n", hotcpu); - mutex_unlock(&relay_channels_mutex); - return notifier_from_errno(-ENOMEM); - } - } - mutex_unlock(&relay_channels_mutex); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - /* No need to flush the cpu : will be flushed upon -* final relay_flush() call. */ - break; + mutex_lock(&relay_channels_mutex); + list_for_each_entry(chan, &relay_channels, list) { + if (chan->buf[cpu]) + continue; + chan->buf[cpu] = relay_open_buf(chan, cpu); + if(!chan->buf[cpu]) { + pr_err("relay: cpu %d buffer creation failed\n", cpu); + mutex_unlock(&relay_channels_mutex); + return -ENOMEM; + } } - return NOTIFY_OK; + + mutex_unlock(&relay_channels_mutex); + return 0; } /** @@ -1355,12 +1333,3 @@ const struct file_operations relay_file_ .splice_read= relay_file_splice_read, }; EXPORT_SYMBOL_GPL(relay_file_operations); - -static __init int relay_init(void) -{ - - hotcpu_notifier(relay_hotcpu_callback, 0); - return 0; -} - -early_initcall(relay_init); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 40/40] slab: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 15 ++ kernel/cpu.c |8 +++ mm/slab.c | 102 ++--- 3 files changed, 64 insertions(+), 61 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -19,6 +19,7 @@ enum cpuhp_states { CPUHP_X2APIC_PREPARE, CPUHP_SMPCFD_PREPARE, CPUHP_RELAY_PREPARE, + CPUHP_SLAB_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -49,6 +50,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_ONLINE, CPUHP_CPUFREQ_ONLINE, CPUHP_RCUTREE_ONLINE, + CPUHP_SLAB_ONLINE, CPUHP_NOTIFY_ONLINE, CPUHP_PROFILE_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, @@ -211,4 +213,17 @@ int relay_prepare_cpu(unsigned int cpu); #define relay_prepare_cpu NULL #endif +/* slab hotplug events */ +#if defined(CONFIG_SLAB) && defined(CONFIG_SMP) +int slab_prepare_cpu(unsigned int cpu); +int slab_online_cpu(unsigned int cpu); +int slab_offline_cpu(unsigned int cpu); +int slab_dead_cpu(unsigned int cpu); +#else +#define slab_prepare_cpu NULL +#define slab_online_cpuNULL +#define slab_offline_cpu NULL +#define slab_dead_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -772,6 +772,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = relay_prepare_cpu, .teardown = NULL, }, + [CPUHP_SLAB_PREPARE] = { + .startup = slab_prepare_cpu, + .teardown = slab_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -820,6 +824,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = profile_online_cpu, .teardown = NULL, }, + [CPUHP_SLAB_ONLINE] = { + .startup = slab_online_cpu, + .teardown = slab_offline_cpu, + }, [CPUHP_NOTIFY_DOWN_PREPARE] = { .startup = NULL, .teardown = notify_down_prepare, Index: linux-2.6/mm/slab.c === --- linux-2.6.orig/mm/slab.c +++ linux-2.6/mm/slab.c @@ -1426,65 +1426,51 @@ bad: return -ENOMEM; } -static int __cpuinit cpuup_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +int __cpuinit slab_prepare_cpu(unsigned int cpu) { - long cpu = (long)hcpu; - int err = 0; + int err; - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - mutex_lock(&slab_mutex); - err = cpuup_prepare(cpu); - mutex_unlock(&slab_mutex); - break; - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - start_cpu_timer(cpu); - break; -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DOWN_PREPARE: - case CPU_DOWN_PREPARE_FROZEN: - /* -* Shutdown cache reaper. Note that the slab_mutex is -* held so that if cache_reap() is invoked it cannot do -* anything expensive but will only modify reap_work -* and reschedule the timer. - */ - cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu)); - /* Now the cache_reaper is guaranteed to be not running. */ - per_cpu(slab_reap_work, cpu).work.func = NULL; - break; - case CPU_DOWN_FAILED: - case CPU_DOWN_FAILED_FROZEN: - start_cpu_timer(cpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - /* -* Even if all the cpus of a node are down, we don't free the -* kmem_list3 of any cache. This to avoid a race between -* cpu_down, and a kmalloc allocation from another cpu for -* memory from the node of the cpu going down. The list3 -* structure is usually allocated from kmem_cache_create() and -* gets destroyed at kmem_cache_destroy(). 
-*/ - /* fall through */ -#endif - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - mutex_lock(&slab_mutex); - cpuup_canceled(cpu); - mutex_unlock(&slab_mutex); - break; - } - return notifier_from_errno(err); + mutex_lock(&slab_mutex); + err = cpu
[patch 04/40] cpu: Restructure FROZEN state handling
There are only a few callbacks which really care about FROZEN vs. !FROZEN. No need to have extra states for this. Publish the frozen state in an extra variable which is updated under the hotplug lock and let the users interested deal with it w/o imposing that extra state checks on everyone. Signed-off-by: Thomas Gleixner --- kernel/cpu.c | 66 --- 1 file changed, 27 insertions(+), 39 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -25,6 +25,7 @@ #ifdef CONFIG_SMP /* Serializes the updates to cpu_online_mask, cpu_present_mask */ static DEFINE_MUTEX(cpu_add_remove_lock); +static bool cpuhp_tasks_frozen; /* * The following two API's must be used when attempting @@ -148,27 +149,30 @@ int __ref register_cpu_notifier(struct n return ret; } -static int __cpu_notify(unsigned long val, void *v, int nr_to_call, +static int __cpu_notify(unsigned long val, unsigned int cpu, int nr_to_call, int *nr_calls) { + unsigned long mod = cpuhp_tasks_frozen ? CPU_TASKS_FROZEN : 0; + void *hcpu = (void *)(long)cpu; + int ret; - ret = __raw_notifier_call_chain(&cpu_chain, val, v, nr_to_call, + ret = __raw_notifier_call_chain(&cpu_chain, val | mod, hcpu, nr_to_call, nr_calls); return notifier_to_errno(ret); } -static int cpu_notify(unsigned long val, void *v) +static int cpu_notify(unsigned long val, unsigned int cpu) { - return __cpu_notify(val, v, -1, NULL); + return __cpu_notify(val, cpu, -1, NULL); } #ifdef CONFIG_HOTPLUG_CPU -static void cpu_notify_nofail(unsigned long val, void *v) +static void cpu_notify_nofail(unsigned long val, unsigned int cpu) { - BUG_ON(cpu_notify(val, v)); + BUG_ON(cpu_notify(val, cpu)); } EXPORT_SYMBOL(register_cpu_notifier); @@ -237,23 +241,17 @@ static inline void check_for_tasks(int c write_unlock_irq(&tasklist_lock); } -struct take_cpu_down_param { - unsigned long mod; - void *hcpu; -}; - /* Take this CPU down. */ static int __ref take_cpu_down(void *_param) { - struct take_cpu_down_param *param = _param; - int err; + int err, cpu = smp_processor_id(); /* Ensure this CPU doesn't handle any more interrupts. */ err = __cpu_disable(); if (err < 0) return err; - cpu_notify(CPU_DYING | param->mod, param->hcpu); + cpu_notify(CPU_DYING, cpu); /* Park the stopper thread */ kthread_park(current); return 0; @@ -263,12 +261,6 @@ static int __ref take_cpu_down(void *_pa static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) { int err, nr_calls = 0; - void *hcpu = (void *)(long)cpu; - unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0; - struct take_cpu_down_param tcd_param = { - .mod = mod, - .hcpu = hcpu, - }; if (num_online_cpus() == 1) return -EBUSY; @@ -278,21 +270,23 @@ static int __ref _cpu_down(unsigned int cpu_hotplug_begin(); - err = __cpu_notify(CPU_DOWN_PREPARE | mod, hcpu, -1, &nr_calls); + cpuhp_tasks_frozen = tasks_frozen; + + err = __cpu_notify(CPU_DOWN_PREPARE, cpu, -1, &nr_calls); if (err) { nr_calls--; - __cpu_notify(CPU_DOWN_FAILED | mod, hcpu, nr_calls, NULL); + __cpu_notify(CPU_DOWN_FAILED, cpu, nr_calls, NULL); printk("%s: attempt to take down CPU %u failed\n", __func__, cpu); goto out_release; } smpboot_park_threads(cpu); - err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu)); + err = __stop_machine(take_cpu_down, NULL, cpumask_of(cpu)); if (err) { /* CPU didn't die: tell everyone. Can't complain. 
*/ smpboot_unpark_threads(cpu); - cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu); + cpu_notify_nofail(CPU_DOWN_FAILED, cpu); goto out_release; } BUG_ON(cpu_online(cpu)); @@ -311,14 +305,14 @@ static int __ref _cpu_down(unsigned int __cpu_die(cpu); /* CPU is completely dead: tell everyone. Too late to complain. */ - cpu_notify_nofail(CPU_DEAD | mod, hcpu); + cpu_notify_nofail(CPU_DEAD, cpu); check_for_tasks(cpu); out_release: cpu_hotplug_done(); if (!err) - cpu_notify_nofail(CPU_POST_DEAD | mod, hcpu); + cpu_notify_nofail(CPU_POST_DEAD, cpu); return err; } @@ -345,10 +339,8 @@ EXPORT_SYMBOL(cpu_down); /* Requires cpu_add_remove_lock to be held */ static int __cpuinit _cpu_up(u
[patch 24/40] arm64: Convert generic timers to hotplug state machine
Straight forward replacement. Signed-off-by: Thomas Gleixner --- drivers/clocksource/arm_generic.c | 40 +++--- include/linux/cpuhotplug.h|1 2 files changed, 13 insertions(+), 28 deletions(-) Index: linux-2.6/drivers/clocksource/arm_generic.c === --- linux-2.6.orig/drivers/clocksource/arm_generic.c +++ linux-2.6/drivers/clocksource/arm_generic.c @@ -91,8 +91,10 @@ static int arch_timer_set_next_event(uns return 0; } -static void __cpuinit arch_timer_setup(struct clock_event_device *clk) +static int __cpuinit arch_timer_cpu_starting(unsigned int cpu) { + struct clock_event_device *clk = per_cpu_ptr(&arch_timer_evt, cpu); + /* Let's make sure the timer is off before doing anything else */ arch_timer_stop(); @@ -157,34 +159,17 @@ unsigned long long notrace sched_clock(v return arch_counter_get_cntvct() * sched_clock_mult; } -static int __cpuinit arch_timer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) + +static int __cpuinit arch_timer_dying_cpu(unsigned int cpu) { - int cpu = (long)hcpu; struct clock_event_device *clk = per_cpu_ptr(&arch_timer_evt, cpu); - switch(action) { - case CPU_STARTING: - case CPU_STARTING_FROZEN: - arch_timer_setup(clk); - break; - - case CPU_DYING: - case CPU_DYING_FROZEN: - pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n", -clk->irq, cpu); - disable_percpu_irq(clk->irq); - arch_timer_set_mode(CLOCK_EVT_MODE_UNUSED, clk); - break; - } - - return NOTIFY_OK; + pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n", clk->irq, cpu); + disable_percpu_irq(clk->irq); + arch_timer_set_mode(CLOCK_EVT_MODE_UNUSED, clk); + return 0; } -static struct notifier_block __cpuinitdata arch_timer_cpu_nb = { - .notifier_call = arch_timer_cpu_notify, -}; - static const struct of_device_id arch_timer_of_match[] __initconst = { { .compatible = "arm,armv8-timer" }, {}, @@ -223,10 +208,9 @@ int __init arm_generic_timer_init(void) /* Calibrate the delay loop directly */ lpj_fine = DIV_ROUND_CLOSEST(arch_timer_rate, HZ); - /* Immediately configure the timer on the boot CPU */ - arch_timer_setup(this_cpu_ptr(&arch_timer_evt)); - - register_cpu_notifier(&arch_timer_cpu_nb); + /* Register and immediately configure the timer on the boot CPU */ + return cpuhp_setup_state(CPUHP_AP_ARM64_TIMER_STARTING, +arch_timer_starting_cpu, arch_timer_dying_cpu); return 0; } Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 23/40] cpufreq: Convert to hotplug state machine
Straight forward conversion to state machine callbacks w/o fixing the obvious brokeness of the asymetric state invocations. Signed-off-by: Thomas Gleixner --- drivers/cpufreq/cpufreq_stats.c | 55 +--- include/linux/cpuhotplug.h |2 + 2 files changed, 15 insertions(+), 42 deletions(-) Index: linux-2.6/drivers/cpufreq/cpufreq_stats.c === --- linux-2.6.orig/drivers/cpufreq/cpufreq_stats.c +++ linux-2.6/drivers/cpufreq/cpufreq_stats.c @@ -167,7 +167,7 @@ static int freq_table_get_index(struct c /* should be called late in the CPU removal sequence so that the stats * memory is still available in case someone tries to use it. */ -static void cpufreq_stats_free_table(unsigned int cpu) +static int cpufreq_stats_free_table(unsigned int cpu) { struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu); if (stat) { @@ -175,18 +175,20 @@ static void cpufreq_stats_free_table(uns kfree(stat); } per_cpu(cpufreq_stats_table, cpu) = NULL; + return 0; } /* must be called early in the CPU removal sequence (before * cpufreq_remove_dev) so that policy is still valid. */ -static void cpufreq_stats_free_sysfs(unsigned int cpu) +static int cpufreq_stats_free_sysfs(unsigned int cpu) { struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); if (policy && policy->cpu == cpu) sysfs_remove_group(&policy->kobj, &stats_attr_group); if (policy) cpufreq_cpu_put(policy); + return 0; } static int cpufreq_stats_create_table(struct cpufreq_policy *policy, @@ -316,35 +318,6 @@ static int cpufreq_stat_notifier_trans(s return 0; } -static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb, - unsigned long action, - void *hcpu) -{ - unsigned int cpu = (unsigned long)hcpu; - - switch (action) { - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - cpufreq_update_policy(cpu); - break; - case CPU_DOWN_PREPARE: - case CPU_DOWN_PREPARE_FROZEN: - cpufreq_stats_free_sysfs(cpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - cpufreq_stats_free_table(cpu); - break; - } - return NOTIFY_OK; -} - -/* priority=1 so this will get called before cpufreq_remove_dev */ -static struct notifier_block cpufreq_stat_cpu_notifier __refdata = { - .notifier_call = cpufreq_stat_cpu_callback, - .priority = 1, -}; - static struct notifier_block notifier_policy_block = { .notifier_call = cpufreq_stat_notifier_policy }; @@ -364,18 +337,19 @@ static int __init cpufreq_stats_init(voi if (ret) return ret; - register_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) - cpufreq_update_policy(cpu); + /* Install callbacks. Core will call them for each online cpu */ + cpuhp_setup_state(CPUHP_CPUFREQ_DEAD, NULL, cpufreq_stats_free_table); + /* CHECKME: This is pretty broken versus failures in up/down! 
*/ + cpuhp_setup_state(CPUHP_CPUFREQ_ONLINE, cpufreq_update_policy, + cpufreq_stats_free_sysfs); ret = cpufreq_register_notifier(¬ifier_trans_block, CPUFREQ_TRANSITION_NOTIFIER); if (ret) { cpufreq_unregister_notifier(¬ifier_policy_block, CPUFREQ_POLICY_NOTIFIER); - unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) - cpufreq_stats_free_table(cpu); + cpuhp_uninstall_callbacks(cpufreq_stats_cbs, + ARRAY_SIZE(cpufreq_stats_cbs)); return ret; } @@ -389,11 +363,8 @@ static void __exit cpufreq_stats_exit(vo CPUFREQ_POLICY_NOTIFIER); cpufreq_unregister_notifier(¬ifier_trans_block, CPUFREQ_TRANSITION_NOTIFIER); - unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) { - cpufreq_stats_free_table(cpu); - cpufreq_stats_free_sysfs(cpu); - } + cpuhp_uninstall_callbacks(cpufreq_stats_cbs, + ARRAY_SIZE(cpufreq_stats_cbs)); } MODULE_AUTHOR("Zou Nan hai "); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_PREP, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, +
[patch 27/40] virt: Convert kvm hotplug to state machine
Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |1 + virt/kvm/kvm_main.c| 42 -- 2 files changed, 17 insertions(+), 26 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -25,6 +25,7 @@ enum cpuhp_states { CPUHP_AP_PERF_ARM_STARTING, CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, + CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, Index: linux-2.6/virt/kvm/kvm_main.c === --- linux-2.6.orig/virt/kvm/kvm_main.c +++ linux-2.6/virt/kvm/kvm_main.c @@ -2496,30 +2496,23 @@ static int hardware_enable_all(void) return r; } -static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val, - void *v) +static int kvm_starting_cpu(unsigned int cpu) { - int cpu = (long)v; - - if (!kvm_usage_count) - return NOTIFY_OK; - - val &= ~CPU_TASKS_FROZEN; - switch (val) { - case CPU_DYING: - printk(KERN_INFO "kvm: disabling virtualization on CPU%d\n", - cpu); - hardware_disable(NULL); - break; - case CPU_STARTING: - printk(KERN_INFO "kvm: enabling virtualization on CPU%d\n", - cpu); + if (kvm_usage_count) { + pr_info("kvm: enabling virtualization on CPU%u\n", cpu); hardware_enable(NULL); - break; } - return NOTIFY_OK; + return 0; } +static int kvm_dying_cpu(unsigned int cpu) +{ + if (kvm_usage_count) { + pr_info("kvm: disabling virtualization on CPU%u\n", cpu); + hardware_disable(NULL); + } + return 0; +} asmlinkage void kvm_spurious_fault(void) { @@ -2725,10 +2718,6 @@ int kvm_io_bus_unregister_dev(struct kvm return r; } -static struct notifier_block kvm_cpu_notifier = { - .notifier_call = kvm_cpu_hotplug, -}; - static int vm_stat_get(void *_offset, u64 *val) { unsigned offset = (long)_offset; @@ -2870,7 +2859,8 @@ int kvm_init(void *opaque, unsigned vcpu goto out_free_1; } - r = register_cpu_notifier(&kvm_cpu_notifier); + r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, kvm_starting_cpu, + kvm_dying_cpu); if (r) goto out_free_2; register_reboot_notifier(&kvm_reboot_notifier); @@ -2920,7 +2910,7 @@ out_free: kmem_cache_destroy(kvm_vcpu_cache); out_free_3: unregister_reboot_notifier(&kvm_reboot_notifier); - unregister_cpu_notifier(&kvm_cpu_notifier); + cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING); out_free_2: out_free_1: kvm_arch_hardware_unsetup(); @@ -2941,7 +2931,7 @@ void kvm_exit(void) kvm_async_pf_deinit(); unregister_syscore_ops(&kvm_syscore_ops); unregister_reboot_notifier(&kvm_reboot_notifier); - unregister_cpu_notifier(&kvm_cpu_notifier); + cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING); on_each_cpu(hardware_disable_nolock, NULL, 1); kvm_arch_hardware_unsetup(); kvm_arch_exit(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 30/40] x86: tboot: Convert to hotplug state machine
Signed-off-by: Thomas Gleixner --- arch/x86/kernel/tboot.c| 23 +++ include/linux/cpuhotplug.h |1 + 2 files changed, 8 insertions(+), 16 deletions(-) Index: linux-2.6/arch/x86/kernel/tboot.c === --- linux-2.6.orig/arch/x86/kernel/tboot.c +++ linux-2.6/arch/x86/kernel/tboot.c @@ -319,25 +319,16 @@ static int tboot_wait_for_aps(int num_ap return !(atomic_read((atomic_t *)&tboot->num_in_wfs) == num_aps); } -static int __cpuinit tboot_cpu_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +static int __cpuinit tboot_dying_cpu(unsigned int cpu) { - switch (action) { - case CPU_DYING: - atomic_inc(&ap_wfs_count); - if (num_online_cpus() == 1) - if (tboot_wait_for_aps(atomic_read(&ap_wfs_count))) - return NOTIFY_BAD; - break; + atomic_inc(&ap_wfs_count); + if (num_online_cpus() == 1) { + if (tboot_wait_for_aps(atomic_read(&ap_wfs_count))) + return -EBUSY; } - return NOTIFY_OK; + return 0; } -static struct notifier_block tboot_cpu_notifier __cpuinitdata = -{ - .notifier_call = tboot_cpu_callback, -}; - static __init int tboot_late_init(void) { if (!tboot_enabled()) @@ -346,7 +337,7 @@ static __init int tboot_late_init(void) tboot_create_trampoline(); atomic_set(&ap_wfs_count, 0); - register_hotcpu_notifier(&tboot_cpu_notifier); + cpuhp_setup_state(CPUHP_AP_X86_TBOOT_DYING, NULL, tboot_dying_cpu); acpi_os_set_prepare_sleep(&tboot_sleep); return 0; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -27,6 +27,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 38/40] smp: Convert core to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |5 kernel/cpu.c |4 +++ kernel/smp.c | 50 - 3 files changed, 27 insertions(+), 32 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -17,6 +17,7 @@ enum cpuhp_states { CPUHP_TIMERS_PREPARE, CPUHP_PROFILE_PREPARE, CPUHP_X2APIC_PREPARE, + CPUHP_SMPCFD_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -199,4 +200,8 @@ int profile_online_cpu(unsigned int cpu) #define profile_online_cpu NULL #endif +/* SMP core functions */ +int smpcfd_prepare_cpu(unsigned int cpu); +int smpcfd_dead_cpu(unsigned int cpu); + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -764,6 +764,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = profile_prepare_cpu, .teardown = profile_dead_cpu, }, + [CPUHP_SMPCFD_PREPARE] = { + .startup = smpcfd_prepare_cpu, + .teardown = smpcfd_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/smp.c === --- linux-2.6.orig/kernel/smp.c +++ linux-2.6/kernel/smp.c @@ -45,45 +45,32 @@ struct call_single_queue { static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_queue, call_single_queue); -static int -hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu) +int __cpuinit smpcfd_prepare_cpu(unsigned int cpu) { - long cpu = (long)hcpu; struct call_function_data *cfd = &per_cpu(cfd_data, cpu); - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - if (!zalloc_cpumask_var_node(&cfd->cpumask, GFP_KERNEL, - cpu_to_node(cpu))) - return notifier_from_errno(-ENOMEM); - if (!zalloc_cpumask_var_node(&cfd->cpumask_ipi, GFP_KERNEL, - cpu_to_node(cpu))) - return notifier_from_errno(-ENOMEM); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - - case CPU_DEAD: - case CPU_DEAD_FROZEN: + if (!zalloc_cpumask_var_node(&cfd->cpumask, GFP_KERNEL, +cpu_to_node(cpu))) + return -ENOMEM; + if (!zalloc_cpumask_var_node(&cfd->cpumask_ipi, GFP_KERNEL, +cpu_to_node(cpu))) { free_cpumask_var(cfd->cpumask); - free_cpumask_var(cfd->cpumask_ipi); - break; -#endif - }; - - return NOTIFY_OK; + return -ENOMEM; + } + return; } -static struct notifier_block __cpuinitdata hotplug_cfd_notifier = { - .notifier_call = hotplug_cfd, -}; +int __cpuinit smpcfd_dead_cpu(unsigned int cpu) +{ + struct call_function_data *cfd = &per_cpu(cfd_data, cpu); + + free_cpumask_var(cfd->cpumask); + free_cpumask_var(cfd->cpumask_ipi); + return 0; +} void __init call_function_init(void) { - void *cpu = (void *)(long)smp_processor_id(); int i; for_each_possible_cpu(i) { @@ -93,8 +80,7 @@ void __init call_function_init(void) INIT_LIST_HEAD(&q->list); } - hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu); - register_cpu_notifier(&hotplug_cfd_notifier); + smpcfd_prepare_cpu(smp_processor_id()); } /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 01/40] smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The reason is that the thread is used to bring the cpu down, so it can't be parked along with the other per cpu threads. Allow a per cpu thread to be excluded from automatic parking, so it can park itself once it's done Add a create callback function as well. Signed-off-by: Thomas Gleixner --- include/linux/smpboot.h |5 + kernel/smpboot.c|5 +++-- 2 files changed, 8 insertions(+), 2 deletions(-) Index: linux-2.6/include/linux/smpboot.h === --- linux-2.6.orig/include/linux/smpboot.h +++ linux-2.6/include/linux/smpboot.h @@ -14,6 +14,8 @@ struct smpboot_thread_data; * @thread_should_run: Check whether the thread should run or not. Called with * preemption disabled. * @thread_fn: The associated thread function + * @create:Optional setup function, called when the thread gets + * created (Not called from the thread context) * @setup: Optional setup function, called when the thread gets * operational the first time * @cleanup: Optional cleanup function, called when the thread @@ -22,6 +24,7 @@ struct smpboot_thread_data; * parked (cpu offline) * @unpark:Optional unpark function, called when the thread is * unparked (cpu online) + * @selfparking: Thread is not parked by the park function. * @thread_comm: The base name of the thread */ struct smp_hotplug_thread { @@ -29,10 +32,12 @@ struct smp_hotplug_thread { struct list_headlist; int (*thread_should_run)(unsigned int cpu); void(*thread_fn)(unsigned int cpu); + void(*create)(unsigned int cpu); void(*setup)(unsigned int cpu); void(*cleanup)(unsigned int cpu, bool online); void(*park)(unsigned int cpu); void(*unpark)(unsigned int cpu); + boolselfparking; const char *thread_comm; }; Index: linux-2.6/kernel/smpboot.c === --- linux-2.6.orig/kernel/smpboot.c +++ linux-2.6/kernel/smpboot.c @@ -183,9 +183,10 @@ __smpboot_create_thread(struct smp_hotpl kfree(td); return PTR_ERR(tsk); } - get_task_struct(tsk); *per_cpu_ptr(ht->store, cpu) = tsk; + if (ht->create) + ht->create(cpu); return 0; } @@ -225,7 +226,7 @@ static void smpboot_park_thread(struct s { struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu); - if (tsk) + if (tsk && !ht->selfparking) kthread_park(tsk); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
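A usage sketch of the two new fields, with an invented example client that is not taken from the series: the create() hook runs right after kthread creation (not from thread context), and selfparking tells smpboot_park_thread() to leave the thread alone because it parks itself once its teardown work is done.

#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/percpu.h>
#include <linux/smpboot.h>

static DEFINE_PER_CPU(struct task_struct *, example_task);

static int example_should_run(unsigned int cpu)
{
	return 0;			/* nothing to do in this sketch */
}

static void example_thread_fn(unsigned int cpu)
{
	/* A real self-parking user does its per-cpu teardown work here
	 * and eventually parks itself via kthread_parkme(). */
}

static void example_create(unsigned int cpu)
{
	/* Invoked after the kthread is created, not from thread context. */
}

static struct smp_hotplug_thread example_threads = {
	.store			= &example_task,
	.thread_should_run	= example_should_run,
	.thread_fn		= example_thread_fn,
	.create			= example_create,
	.selfparking		= true,
	.thread_comm		= "example/%u",
};

static int __init example_smpboot_init(void)
{
	return smpboot_register_percpu_thread(&example_threads);
}
early_initcall(example_smpboot_init);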
[patch 14/40] x86: perf: Convert the core to the hotplug state machine
Replace the perf_notifier() install mechanism, which invokes magically the callback on the current cpu. Convert the hardware specific callbacks which are invoked from the x86 perf core to return proper error codes instead of totally pointless NOTIFY_BAD return values. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event.c | 78 ++--- arch/x86/kernel/cpu/perf_event_amd.c |6 +- arch/x86/kernel/cpu/perf_event_intel.c |6 +- include/linux/cpuhotplug.h |3 + 4 files changed, 52 insertions(+), 41 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event.c @@ -1252,47 +1252,45 @@ perf_event_nmi_handler(unsigned int cmd, struct event_constraint emptyconstraint; struct event_constraint unconstrained; -static int __cpuinit -x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) +static int __cpuinit x86_pmu_prepare_cpu(unsigned int cpu) { - unsigned int cpu = (long)hcpu; struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); - int ret = NOTIFY_OK; - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - cpuc->kfree_on_online = NULL; - if (x86_pmu.cpu_prepare) - ret = x86_pmu.cpu_prepare(cpu); - break; - - case CPU_STARTING: - if (x86_pmu.attr_rdpmc) - set_in_cr4(X86_CR4_PCE); - if (x86_pmu.cpu_starting) - x86_pmu.cpu_starting(cpu); - break; + cpuc->kfree_on_online = NULL; + if (x86_pmu.cpu_prepare) + return x86_pmu.cpu_prepare(cpu); + return 0; +} - case CPU_ONLINE: - kfree(cpuc->kfree_on_online); - break; +static int __cpuinit x86_pmu_dead_cpu(unsigned int cpu) +{ + if (x86_pmu.cpu_dead) + x86_pmu.cpu_dead(cpu); + return 0; +} - case CPU_DYING: - if (x86_pmu.cpu_dying) - x86_pmu.cpu_dying(cpu); - break; +static int __cpuinit x86_pmu_online_cpu(unsigned int cpu) +{ + struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); - case CPU_UP_CANCELED: - case CPU_DEAD: - if (x86_pmu.cpu_dead) - x86_pmu.cpu_dead(cpu); - break; + kfree(cpuc->kfree_on_online); + return 0; +} - default: - break; - } +static int __cpuinit x86_pmu_starting_cpu(unsigned int cpu) +{ + if (x86_pmu.attr_rdpmc) + set_in_cr4(X86_CR4_PCE); + if (x86_pmu.cpu_starting) + x86_pmu.cpu_starting(cpu); + return 0; +} - return ret; +static int __cpuinit x86_pmu_dying_cpu(unsigned int cpu) +{ + if (x86_pmu.cpu_dying) + x86_pmu.cpu_dying(cpu); + return 0; } static void __init pmu_check_apic(void) @@ -1485,8 +1483,18 @@ static int __init init_hw_perf_events(vo pr_info("... event mask: %016Lx\n", x86_pmu.intel_ctrl); perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW); - perf_cpu_notifier(x86_pmu_notifier); - + /* +* Install callbacks. Core will call them for each online +* cpu. 
+* +* FIXME: This should check the return value, but the original +* code did not do that either +*/ + cpuhp_setup_state(CPUHP_PERF_X86_PREPARE, x86_pmu_prepare_cpu, + x86_pmu_dead_cpu); + cpuhp_setup_state(CPUHP_AP_PERF_X86_STARTING, x86_pmu_starting_cpu, + x86_pmu_dying_cpu); + cpuhp_setup_state(CPUHP_PERF_X86_ONLINE, x86_pmu_online_cpu, NULL); return 0; } early_initcall(init_hw_perf_events); Index: linux-2.6/arch/x86/kernel/cpu/perf_event_amd.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_amd.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_amd.c @@ -349,13 +349,13 @@ static int amd_pmu_cpu_prepare(int cpu) WARN_ON_ONCE(cpuc->amd_nb); if (boot_cpu_data.x86_max_cores < 2) - return NOTIFY_OK; + return 0; cpuc->amd_nb = amd_alloc_nb(cpu); if (!cpuc->amd_nb) - return NOTIFY_BAD; + return -ENOMEM; - return NOTIFY_OK; + return 0; } static void amd_pmu_cpu_starting(int cpu) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c @@ -1662,13 +1662,13 @@ static int intel_pmu_cpu_prepare(int cpu
[patch 36/40] profile: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger --- include/linux/cpuhotplug.h | 12 + kernel/cpu.c |8 +++ kernel/profile.c | 92 + 3 files changed, 63 insertions(+), 49 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -15,6 +15,7 @@ enum cpuhp_states { CPUHP_RCUTREE_PREPARE, CPUHP_HRTIMERS_PREPARE, CPUHP_TIMERS_PREPARE, + CPUHP_PROFILE_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -46,6 +47,7 @@ enum cpuhp_states { CPUHP_CPUFREQ_ONLINE, CPUHP_RCUTREE_ONLINE, CPUHP_NOTIFY_ONLINE, + CPUHP_PROFILE_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, CPUHP_PERF_X86_UNCORE_ONLINE, CPUHP_PERF_X86_ONLINE, @@ -186,4 +188,14 @@ int timers_dead_cpu(unsigned int cpu); #define timers_dead_cpuNULL #endif +#if defined(CONFIG_PROFILING) && defined(CONFIG_HOTPLUG_CPU) +int profile_prepare_cpu(unsigned int cpu); +int profile_dead_cpu(unsigned int cpu); +int profile_online_cpu(unsigned int cpu); +#else +#define profile_prepare_cpuNULL +#define profile_dead_cpu NULL +#define profile_online_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -760,6 +760,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = timers_prepare_cpu, .teardown = timers_dead_cpu, }, + [CPUHP_PROFILE_PREPARE] = { + .startup = profile_prepare_cpu, + .teardown = profile_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -804,6 +808,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = notify_online, .teardown = NULL, }, + [CPUHP_PROFILE_ONLINE] = { + .startup = profile_online_cpu, + .teardown = NULL, + }, [CPUHP_NOTIFY_DOWN_PREPARE] = { .startup = NULL, .teardown = notify_down_prepare, Index: linux-2.6/kernel/profile.c === --- linux-2.6.orig/kernel/profile.c +++ linux-2.6/kernel/profile.c @@ -353,68 +353,63 @@ out: put_cpu(); } -static int __cpuinit profile_cpu_callback(struct notifier_block *info, - unsigned long action, void *__cpu) +int __cpuinit profile_dead_cpu(unsigned int cpu) { - int node, cpu = (unsigned long)__cpu; struct page *page; + int i; - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - node = cpu_to_mem(cpu); - per_cpu(cpu_profile_flip, cpu) = 0; - if (!per_cpu(cpu_profile_hits, cpu)[1]) { - page = alloc_pages_exact_node(node, - GFP_KERNEL | __GFP_ZERO, - 0); - if (!page) - return notifier_from_errno(-ENOMEM); - per_cpu(cpu_profile_hits, cpu)[1] = page_address(page); - } - if (!per_cpu(cpu_profile_hits, cpu)[0]) { - page = alloc_pages_exact_node(node, - GFP_KERNEL | __GFP_ZERO, - 0); - if (!page) - goto out_free; - per_cpu(cpu_profile_hits, cpu)[0] = page_address(page); - } - break; -out_free: - page = virt_to_page(per_cpu(cpu_profile_hits, cpu)[1]); - per_cpu(cpu_profile_hits, cpu)[1] = NULL; - __free_page(page); - return notifier_from_errno(-ENOMEM); - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - if (prof_cpu_mask != NULL) - cpumask_set_cpu(cpu, prof_cpu_mask); - break; - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - case CPU_DEAD: - case CPU_DEAD_FROZEN: - if (prof_cpu_mask != NULL) - cpumask_clear_cpu(cpu, prof_cpu_mask); - if (per_cpu(cpu_profile_hits, cpu)[0]) { - page = virt_to_page(per_cpu(cpu_profile_hits, cpu)[0]); - per_cpu(cpu_profile_hits, cpu)[0] = NULL; + if (prof_cpu_mask != NULL) + cpumask_clear_cpu(cpu, prof_cpu_mask); + + for (i = 0; i < 2; i++) { + if (per_cpu(cpu_profile_hits, cpu)[i]) { +
[patch 37/40] x86: x2apic: Convert to cpu hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- arch/x86/kernel/apic/x2apic_cluster.c | 80 -- include/linux/cpuhotplug.h|1 2 files changed, 31 insertions(+), 50 deletions(-) Index: linux-2.6/arch/x86/kernel/apic/x2apic_cluster.c === --- linux-2.6.orig/arch/x86/kernel/apic/x2apic_cluster.c +++ linux-2.6/arch/x86/kernel/apic/x2apic_cluster.c @@ -145,68 +145,48 @@ static void init_x2apic_ldr(void) } } - /* - * At CPU state changes, update the x2apic cluster sibling info. - */ -static int __cpuinit -update_clusterinfo(struct notifier_block *nfb, unsigned long action, void *hcpu) +/* + * At CPU state changes, update the x2apic cluster sibling info. + */ +int __cpuinit x2apic_prepare_cpu(unsigned int cpu) { - unsigned int this_cpu = (unsigned long)hcpu; - unsigned int cpu; - int err = 0; - - switch (action) { - case CPU_UP_PREPARE: - if (!zalloc_cpumask_var(&per_cpu(cpus_in_cluster, this_cpu), - GFP_KERNEL)) { - err = -ENOMEM; - } else if (!zalloc_cpumask_var(&per_cpu(ipi_mask, this_cpu), - GFP_KERNEL)) { - free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); - err = -ENOMEM; - } - break; - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - case CPU_DEAD: - for_each_online_cpu(cpu) { - if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu)) - continue; - __cpu_clear(this_cpu, per_cpu(cpus_in_cluster, cpu)); - __cpu_clear(cpu, per_cpu(cpus_in_cluster, this_cpu)); - } - free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); - free_cpumask_var(per_cpu(ipi_mask, this_cpu)); - break; + if (!zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL)) + return -ENOMEM; + + if (!zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL)) { + free_cpumask_var(per_cpu(cpus_in_cluster, cpu)); + return -ENOMEM; } - return notifier_from_errno(err); + return 0; } -static struct notifier_block __refdata x2apic_cpu_notifier = { - .notifier_call = update_clusterinfo, -}; - -static int x2apic_init_cpu_notifier(void) +int __cpuinit x2apic_dead_cpu(unsigned int this_cpu) { - int cpu = smp_processor_id(); - - zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL); - zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL); + int cpu; - BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)); - - __cpu_set(cpu, per_cpu(cpus_in_cluster, cpu)); - register_hotcpu_notifier(&x2apic_cpu_notifier); - return 1; + for_each_online_cpu(cpu) { + if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu)) + continue; + __cpu_clear(this_cpu, per_cpu(cpus_in_cluster, cpu)); + __cpu_clear(cpu, per_cpu(cpus_in_cluster, this_cpu)); + } + free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); + free_cpumask_var(per_cpu(ipi_mask, this_cpu)); + return 0; } static int x2apic_cluster_probe(void) { - if (x2apic_mode) - return x2apic_init_cpu_notifier(); - else + int cpu = smp_processor_id(); + + if (!x2apic_mode) return 0; + + __cpu_set(cpu, per_cpu(cpus_in_cluster, cpu)); + cpuhp_setup_state(CPUHP_X2APIC_PREPARE, x2apic_prepare_cpu, + x2apic_dead_cpu); + return 1; } static const struct cpumask *x2apic_cluster_target_cpus(void) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -16,6 +16,7 @@ enum cpuhp_states { CPUHP_HRTIMERS_PREPARE, CPUHP_TIMERS_PREPARE, CPUHP_PROFILE_PREPARE, + CPUHP_X2APIC_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More 
majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
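Two registration styles show up in the series. Core infrastructure gets a static entry in the cpuhp_bp_states[]/cpuhp_ap_states[] tables in kernel/cpu.c (as the profile patch above does), while code like the x2apic cluster driver registers at runtime with cpuhp_setup_state(). A minimal usage sketch of the runtime form, following the three-argument (state, startup, teardown) calls visible in these patches; "foo" and CPUHP_FOO_PREPARE are placeholders, and a real user would first add the enum slot to cpuhotplug.h just as CPUHP_X2APIC_PREPARE is added here:

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>

        static int foo_prepare_cpu(unsigned int cpu)
        {
                /* allocate whatever this CPU will need, may fail with -ENOMEM */
                return 0;
        }

        static int foo_dead_cpu(unsigned int cpu)
        {
                /* release it again once the CPU is gone */
                return 0;
        }

        static int __init foo_hotplug_init(void)
        {
                cpuhp_setup_state(CPUHP_FOO_PREPARE, foo_prepare_cpu, foo_dead_cpu);
                return 0;
        }
        device_initcall(foo_hotplug_init);

Note that x2apic_cluster_probe() above still seeds the boot CPU's own cpus_in_cluster mask by hand before registering the callbacks.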
[patch 35/40] timers: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |4 kernel/cpu.c |4 kernel/timer.c | 43 +-- 3 files changed, 13 insertions(+), 38 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_PREP, CPUHP_RCUTREE_PREPARE, CPUHP_HRTIMERS_PREPARE, + CPUHP_TIMERS_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -176,10 +177,13 @@ int clockevents_dead_cpu(unsigned int cp #endif int hrtimers_prepare_cpu(unsigned int cpu); +int timers_prepare_cpu(unsigned int cpu); #ifdef CONFIG_HOTPLUG_CPU int hrtimers_dead_cpu(unsigned int cpu); +int timers_dead_cpu(unsigned int cpu); #else #define hrtimers_dead_cpu NULL +#define timers_dead_cpuNULL #endif #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -756,6 +756,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = hrtimers_prepare_cpu, .teardown = hrtimers_dead_cpu, }, + [CPUHP_TIMERS_PREPARE] = { + .startup = timers_prepare_cpu, + .teardown = timers_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/timer.c === --- linux-2.6.orig/kernel/timer.c +++ linux-2.6/kernel/timer.c @@ -1642,7 +1642,7 @@ SYSCALL_DEFINE1(sysinfo, struct sysinfo return 0; } -static int __cpuinit init_timers_cpu(int cpu) +int __cpuinit timers_prepare_cpu(unsigned int cpu) { int j; struct tvec_base *base; @@ -1714,7 +1714,7 @@ static void migrate_timer_list(struct tv } } -static void __cpuinit migrate_timers(int cpu) +int __cpuinit timers_dead_cpu(unsigned int cpu) { struct tvec_base *old_base; struct tvec_base *new_base; @@ -1744,52 +1744,19 @@ static void __cpuinit migrate_timers(int spin_unlock(&old_base->lock); spin_unlock_irq(&new_base->lock); put_cpu_var(tvec_bases); -} -#endif /* CONFIG_HOTPLUG_CPU */ - -static int __cpuinit timer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) -{ - long cpu = (long)hcpu; - int err; - switch(action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - err = init_timers_cpu(cpu); - if (err < 0) - return notifier_from_errno(err); - break; -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DEAD: - case CPU_DEAD_FROZEN: - migrate_timers(cpu); - break; -#endif - default: - break; - } - return NOTIFY_OK; + return 0; } - -static struct notifier_block __cpuinitdata timers_nb = { - .notifier_call = timer_cpu_notify, -}; - +#endif /* CONFIG_HOTPLUG_CPU */ void __init init_timers(void) { - int err; - /* ensure there are enough low bits for flags in timer->base pointer */ BUILD_BUG_ON(__alignof__(struct tvec_base) & TIMER_FLAG_MASK); - err = timer_cpu_notify(&timers_nb, (unsigned long)CPU_UP_PREPARE, - (void *)(long)smp_processor_id()); init_timer_stats(); + BUG_ON(timers_prepare_cpu(smp_processor_id())); - BUG_ON(err != NOTIFY_OK); - register_cpu_notifier(&timers_nb); open_softirq(TIMER_SOFTIRQ, run_timer_softirq); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
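One detail worth noting in the init_timers() hunk: the old code faked a CPU_UP_PREPARE notification for the boot CPU, and the new code simply calls timers_prepare_cpu(smp_processor_id()) directly, because init_timers() runs far too early in start_kernel() for any hotplug machinery to be involved. A sketch of that pattern for an early-init user, with placeholder names:

        #include <linux/bug.h>
        #include <linux/init.h>
        #include <linux/smp.h>

        static int foo_prepare_cpu(unsigned int cpu)
        {
                /* stands in for timers_prepare_cpu(): set up per-CPU state */
                return 0;
        }

        void __init foo_early_init(void)
        {
                /*
                 * The boot CPU never went through the PREPARE phase, so run
                 * its callback by hand; a failure this early in boot is fatal.
                 */
                BUG_ON(foo_prepare_cpu(smp_processor_id()));
        }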
[patch 34/40] cpuhotplug: Remove CPU_DYING notifier
All users gone. Signed-off-by: Thomas Gleixner --- include/linux/cpu.h|6 -- include/linux/cpuhotplug.h |1 - kernel/cpu.c | 11 --- 3 files changed, 18 deletions(-) Index: linux-2.6/include/linux/cpu.h === --- linux-2.6.orig/include/linux/cpu.h +++ linux-2.6/include/linux/cpu.h @@ -60,11 +60,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_PREPARE 0x0005 /* CPU (unsigned)v going down */ #define CPU_DOWN_FAILED0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ -#define CPU_DYING 0x0008 /* CPU (unsigned)v not running any task, - * not handling interrupts, soon dead. - * Called on the dying cpu, interrupts - * are already disabled. Must not - * sleep, must not fail */ #define CPU_POST_DEAD 0x0009 /* CPU (unsigned)v dead, cpu_hotplug * lock is dropped */ @@ -79,7 +74,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_PREPARE_FROZEN(CPU_DOWN_PREPARE | CPU_TASKS_FROZEN) #define CPU_DOWN_FAILED_FROZEN (CPU_DOWN_FAILED | CPU_TASKS_FROZEN) #define CPU_DEAD_FROZEN(CPU_DEAD | CPU_TASKS_FROZEN) -#define CPU_DYING_FROZEN (CPU_DYING | CPU_TASKS_FROZEN) #ifdef CONFIG_SMP extern bool cpuhp_tasks_frozen; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -29,7 +29,6 @@ enum cpuhp_states { CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, - CPUHP_AP_NOTIFY_DYING, CPUHP_AP_CLOCKEVENTS_DYING, CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -303,12 +303,6 @@ static int notify_down_prepare(unsigned return err; } -static int notify_dying(unsigned int cpu) -{ - cpu_notify(CPU_DYING, cpu); - return 0; -} - /* Take this CPU down. */ static int __ref take_cpu_down(void *_param) { @@ -366,7 +360,6 @@ static int notify_dead(unsigned int cpu) #define notify_down_prepareNULL #define takedown_cpu NULL #define notify_deadNULL -#define notify_dying NULL #endif #ifdef CONFIG_HOTPLUG_CPU @@ -825,10 +818,6 @@ static struct cpuhp_step cpuhp_ap_states .startup = sched_starting_cpu, .teardown = NULL, }, - [CPUHP_AP_NOTIFY_DYING] = { - .startup = NULL, - .teardown = notify_dying, - }, [CPUHP_AP_CLOCKEVENTS_DYING] = { .startup = NULL, .teardown = clockevents_dying_cpu, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 33/40] hrtimer: Convert to hotplug state machine
Split out the clockevents callbacks instead of piggypacking them on hrtimers. This gets rid of a POST_DEAD user. See commit 54e88fad. We just move the callback state to the proper place in the state machine. Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 18 + kernel/cpu.c | 12 +++ kernel/hrtimer.c | 47 - kernel/time/clockevents.c | 13 4 files changed, 48 insertions(+), 42 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -13,8 +13,10 @@ enum cpuhp_states { CPUHP_SCHED_MIGRATE_PREP, CPUHP_WORKQUEUE_PREP, CPUHP_RCUTREE_PREPARE, + CPUHP_HRTIMERS_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, + CPUHP_CLOCKEVENTS_DEAD, CPUHP_CPUFREQ_DEAD, CPUHP_SCHED_DEAD, CPUHP_BRINGUP_CPU, @@ -28,6 +30,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_CLOCKEVENTS_DYING, CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, @@ -165,4 +168,19 @@ int rcutree_dying_cpu(unsigned int cpu); #define rcutree_dying_cpu NULL #endif +#ifdef CONFIG_GENERIC_CLOCKEVENTS +int clockevents_dying_cpu(unsigned int cpu); +int clockevents_dead_cpu(unsigned int cpu); +#else +#define clockevents_dying_cpu NULL +#define clockevents_dead_cpu NULL +#endif + +int hrtimers_prepare_cpu(unsigned int cpu); +#ifdef CONFIG_HOTPLUG_CPU +int hrtimers_dead_cpu(unsigned int cpu); +#else +#define hrtimers_dead_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -759,6 +759,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = rcutree_prepare_cpu, .teardown = rcutree_dead_cpu, }, + [CPUHP_HRTIMERS_PREPARE] = { + .startup = hrtimers_prepare_cpu, + .teardown = hrtimers_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -767,6 +771,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = NULL, .teardown = notify_dead, }, + [CPUHP_CLOCKEVENTS_DEAD] = { + .startup = NULL, + .teardown = clockevents_dead_cpu, + }, [CPUHP_BRINGUP_CPU] = { .startup = bringup_cpu, .teardown = NULL, @@ -821,6 +829,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_CLOCKEVENTS_DYING] = { + .startup = NULL, + .teardown = clockevents_dying_cpu, + }, [CPUHP_AP_RCUTREE_DYING] = { .startup = NULL, .teardown = rcutree_dying_cpu, Index: linux-2.6/kernel/hrtimer.c === --- linux-2.6.orig/kernel/hrtimer.c +++ linux-2.6/kernel/hrtimer.c @@ -1635,7 +1635,7 @@ SYSCALL_DEFINE2(nanosleep, struct timesp /* * Functions related to boot-time initialization: */ -static void __cpuinit init_hrtimers_cpu(int cpu) +int __cpuinit hrtimers_prepare_cpu(unsigned int cpu) { struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); int i; @@ -1648,6 +1648,7 @@ static void __cpuinit init_hrtimers_cpu( } hrtimer_init_hres(cpu_base); + return 0; } #ifdef CONFIG_HOTPLUG_CPU @@ -1685,7 +1686,7 @@ static void migrate_hrtimer_list(struct } } -static void migrate_hrtimers(int scpu) +int __cpuinit hrtimers_dead_cpu(unsigned int scpu) { struct hrtimer_cpu_base *old_base, *new_base; int i; @@ -1714,52 +1715,14 @@ static void migrate_hrtimers(int scpu) /* Check, if we got expired work to do */ __hrtimer_peek_ahead_timers(); local_irq_enable(); + return 0; } #endif /* CONFIG_HOTPLUG_CPU */ -static int __cpuinit hrtimer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) -{ - int scpu = (long)hcpu; - - switch (action) { 
- - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - init_hrtimers_cpu(scpu); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DYING: - case CPU_DYING_FROZEN: - clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DYING, &scpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - { - clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DEA
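The point of this patch, per the changelog, is that the clockevents work stops piggybacking on the hrtimer notifier and gets its own CPUHP_AP_CLOCKEVENTS_DYING and CPUHP_CLOCKEVENTS_DEAD slots. The sketch below only illustrates the two execution contexts as I read the series, reusing the CPU_DYING contract quoted in the cpu.h hunk of patch 34/40; "foo" stands in for any user, it is not the clockevents code (whose kernel/time/clockevents.c hunk is cut off above):

        static int foo_dying_cpu(unsigned int cpu)
        {
                /*
                 * AP (CPUHP_AP_*_DYING) step: runs on the CPU that is going
                 * down, with interrupts already disabled; must not sleep and
                 * must not fail.
                 */
                return 0;
        }

        static int foo_dead_cpu(unsigned int cpu)
        {
                /*
                 * CPUHP_*_DEAD step: runs after the CPU is gone, on a
                 * controlling CPU, so blocking cleanup (freeing, flushing)
                 * belongs here.
                 */
                return 0;
        }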
[patch 32/40] rcu: Convert rcutree to hotplug state machine
Do we really need so many states here ? Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 18 kernel/cpu.c | 12 + kernel/rcutree.c | 95 - 3 files changed, 73 insertions(+), 52 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -12,6 +12,7 @@ enum cpuhp_states { CPUHP_PERF_PREPARE, CPUHP_SCHED_MIGRATE_PREP, CPUHP_WORKQUEUE_PREP, + CPUHP_RCUTREE_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CPUFREQ_DEAD, @@ -27,6 +28,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_NOHZ_DYING, @@ -39,6 +41,7 @@ enum cpuhp_states { CPUHP_SCHED_MIGRATE_ONLINE, CPUHP_WORKQUEUE_ONLINE, CPUHP_CPUFREQ_ONLINE, + CPUHP_RCUTREE_ONLINE, CPUHP_NOTIFY_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, CPUHP_PERF_X86_UNCORE_ONLINE, @@ -147,4 +150,19 @@ int workqueue_prepare_cpu(unsigned int c int workqueue_online_cpu(unsigned int cpu); int workqueue_offline_cpu(unsigned int cpu); +/* RCUtree hotplug events */ +#if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU) +int rcutree_prepare_cpu(unsigned int cpu); +int rcutree_online_cpu(unsigned int cpu); +int rcutree_offline_cpu(unsigned int cpu); +int rcutree_dead_cpu(unsigned int cpu); +int rcutree_dying_cpu(unsigned int cpu); +#else +#define rcutree_prepare_cpuNULL +#define rcutree_online_cpu NULL +#define rcutree_offline_cpuNULL +#define rcutree_dead_cpu NULL +#define rcutree_dying_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -755,6 +755,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = workqueue_prepare_cpu, .teardown = NULL, }, + [CPUHP_RCUTREE_PREPARE] = { + .startup = rcutree_prepare_cpu, + .teardown = rcutree_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -787,6 +791,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = workqueue_online_cpu, .teardown = workqueue_offline_cpu, }, + [CPUHP_RCUTREE_ONLINE] = { + .startup = rcutree_online_cpu, + .teardown = rcutree_offline_cpu, + }, [CPUHP_NOTIFY_ONLINE] = { .startup = notify_online, .teardown = NULL, @@ -813,6 +821,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_RCUTREE_DYING] = { + .startup = NULL, + .teardown = rcutree_dying_cpu, + }, [CPUHP_AP_SCHED_NOHZ_DYING] = { .startup = NULL, .teardown = nohz_balance_exit_idle, Index: linux-2.6/kernel/rcutree.c === --- linux-2.6.orig/kernel/rcutree.c +++ linux-2.6/kernel/rcutree.c @@ -2787,67 +2787,59 @@ rcu_init_percpu_data(int cpu, struct rcu mutex_unlock(&rsp->onoff_mutex); } -static void __cpuinit rcu_prepare_cpu(int cpu) +int __cpuinit rcutree_prepare_cpu(unsigned int cpu) { struct rcu_state *rsp; for_each_rcu_flavor(rsp) rcu_init_percpu_data(cpu, rsp, strcmp(rsp->name, "rcu_preempt") == 0); + rcu_prepare_kthreads(cpu); + return 0; } -/* - * Handle CPU online/offline notification events. 
- */ -static int __cpuinit rcu_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) +int __cpuinit rcutree_dead_cpu(unsigned int cpu) { - long cpu = (long)hcpu; - struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu); - struct rcu_node *rnp = rdp->mynode; struct rcu_state *rsp; - int ret = NOTIFY_OK; - trace_rcu_utilization("Start CPU hotplug"); - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - rcu_prepare_cpu(cpu); - rcu_prepare_kthreads(cpu); - break; - case CPU_ONLINE: - case CPU_DOWN_FAILED: - rcu_boost_kthread_setaffinity(rnp, -1); - break; - case CPU_DOWN_PREPARE: - if (nocb_cpu_expendable(cpu)) - rcu_boost_kthread_setaffi
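The cpuhotplug.h hunk above also shows the convention that keeps the static tables in kernel/cpu.c buildable for every configuration: when a subsystem is compiled out, its callback names collapse to NULL and the corresponding step is simply skipped. A sketch of that header pattern for a hypothetical CONFIG_FOO:

        /* include/linux/cpuhotplug.h style stubs, "foo" being illustrative */
        #ifdef CONFIG_FOO
        int foo_prepare_cpu(unsigned int cpu);
        int foo_online_cpu(unsigned int cpu);
        int foo_dead_cpu(unsigned int cpu);
        #else
        #define foo_prepare_cpu        NULL
        #define foo_online_cpu         NULL
        #define foo_dead_cpu           NULL
        #endif

This is why the rcutree block guards on CONFIG_TREE_RCU || CONFIG_TREE_PREEMPT_RCU rather than on CONFIG_HOTPLUG_CPU alone: the table entries reference all five callbacks unconditionally.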
[patch 03/40] stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread selfparking and park it after it has finished the take_cpu_down() work. Signed-off-by: Thomas Gleixner --- kernel/cpu.c |2 kernel/stop_machine.c | 134 ++ 2 files changed, 51 insertions(+), 85 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -254,6 +254,8 @@ static int __ref take_cpu_down(void *_pa return err; cpu_notify(CPU_DYING | param->mod, param->hcpu); + /* Park the stopper thread */ + kthread_park(current); return 0; } Index: linux-2.6/kernel/stop_machine.c === --- linux-2.6.orig/kernel/stop_machine.c +++ linux-2.6/kernel/stop_machine.c @@ -18,7 +18,7 @@ #include #include #include - +#include #include /* @@ -245,20 +245,25 @@ int try_stop_cpus(const struct cpumask * return ret; } -static int cpu_stopper_thread(void *data) +static int cpu_stop_should_run(unsigned int cpu) +{ + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); + unsigned long flags; + int run; + + spin_lock_irqsave(&stopper->lock, flags); + run = !list_empty(&stopper->works); + spin_unlock_irqrestore(&stopper->lock, flags); + return run; +} + +static void cpu_stopper_thread(unsigned int cpu) { - struct cpu_stopper *stopper = data; + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); struct cpu_stop_work *work; int ret; repeat: - set_current_state(TASK_INTERRUPTIBLE); /* mb paired w/ kthread_stop */ - - if (kthread_should_stop()) { - __set_current_state(TASK_RUNNING); - return 0; - } - work = NULL; spin_lock_irq(&stopper->lock); if (!list_empty(&stopper->works)) { @@ -274,8 +279,6 @@ repeat: struct cpu_stop_done *done = work->done; char ksym_buf[KSYM_NAME_LEN] __maybe_unused; - __set_current_state(TASK_RUNNING); - /* cpu stop callbacks are not allowed to sleep */ preempt_disable(); @@ -291,87 +294,55 @@ repeat: ksym_buf), arg); cpu_stop_signal_done(done, true); - } else - schedule(); - - goto repeat; + goto repeat; + } } extern void sched_set_stop_task(int cpu, struct task_struct *stop); -/* manage stopper for a cpu, mostly lifted from sched migration thread mgmt */ -static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +static void cpu_stop_create(unsigned int cpu) +{ + sched_set_stop_task(cpu, per_cpu(cpu_stopper_task, cpu)); +} + +static void cpu_stop_park(unsigned int cpu) { - unsigned int cpu = (unsigned long)hcpu; struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); - struct task_struct *p = per_cpu(cpu_stopper_task, cpu); + struct cpu_stop_work *work; + unsigned long flags; - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - BUG_ON(p || stopper->enabled || !list_empty(&stopper->works)); - p = kthread_create_on_node(cpu_stopper_thread, - stopper, - cpu_to_node(cpu), - "migration/%d", cpu); - if (IS_ERR(p)) - return notifier_from_errno(PTR_ERR(p)); - get_task_struct(p); - kthread_bind(p, cpu); - sched_set_stop_task(cpu, p); - per_cpu(cpu_stopper_task, cpu) = p; - break; + /* drain remaining works */ + spin_lock_irqsave(&stopper->lock, flags); + list_for_each_entry(work, &stopper->works, list) + cpu_stop_signal_done(work->done, false); + stopper->enabled = false; + spin_unlock_irqrestore(&stopper->lock, flags); +} - case CPU_ONLINE: - /* strictly unnecessary, as first user will wake it */ - wake_up_process(p); - /* mark enabled */ - spin_lock_irq(&stopper->lock); - stopper->enabled = true; - spin_unlock_irq(&stopper->lock); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_UP_CANCELED: - case CPU_POST_DEAD: - { 
- struct cpu_stop_work *work; - - sched_set_stop_task(cpu, NULL); - /* kill the stopper */ - kthread_stop(p); - /* drain
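The hunk is truncated before the interesting conclusion: the notifier goes away entirely and the stopper threads are handed to the generic smpboot infrastructure, with the new cpu_stop_should_run()/cpu_stopper_thread()/cpu_stop_create()/cpu_stop_park() functions becoming hooks of a struct smp_hotplug_thread. The descriptor below is a sketch of what such a registration generally looks like; the field names come from the smpboot API as I recall it, not from the cut-off part of this patch, and "foo" is a placeholder:

        #include <linux/init.h>
        #include <linux/percpu.h>
        #include <linux/sched.h>
        #include <linux/smpboot.h>

        static DEFINE_PER_CPU(struct task_struct *, foo_task);

        static int foo_should_run(unsigned int cpu)
        {
                return 0;       /* is there queued work for this CPU? */
        }

        static void foo_thread_fn(unsigned int cpu)
        {
                /* handle one batch of work; the smpboot core loops around this */
        }

        static void foo_create(unsigned int cpu)
        {
                /* one-time setup when the per-CPU thread is first created */
        }

        static void foo_park(unsigned int cpu)
        {
                /* flush or fail pending work before the thread parks on CPU down */
        }

        static struct smp_hotplug_thread foo_threads = {
                .store                  = &foo_task,
                .thread_should_run      = foo_should_run,
                .thread_fn              = foo_thread_fn,
                .thread_comm            = "foo/%u",
                .create                 = foo_create,
                .park                   = foo_park,
                .selfparking            = true, /* parked by its user, not by the core */
        };

        static int __init foo_threads_init(void)
        {
                return smpboot_register_percpu_thread(&foo_threads);
        }

The kthread_park(current) added to take_cpu_down() in the visible kernel/cpu.c hunk is the counterpart of .selfparking: the stopper parks itself once the teardown work is done instead of being parked by the hotplug core.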
[patch 11/40] x86: uncore: Move teardown callback to CPU_DEAD
No point calling this from the dying cpu. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_intel_uncore.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel_uncore.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c @@ -2622,7 +2622,7 @@ static void __init uncore_pci_exit(void) } } -static void __cpuinit uncore_cpu_dying(int cpu) +static void __cpuinit uncore_cpu_dead(int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2803,8 +2803,8 @@ static int uncore_cpu_starting(cpu); break; case CPU_UP_CANCELED: - case CPU_DYING: - uncore_cpu_dying(cpu); + case CPU_DEAD: + uncore_cpu_dead(cpu); break; default: break; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 12/40] x86: uncore: Convert to hotplug state machine
Convert the notifiers to state machine states and let the core code do the setup for the already online cpus. This notifier has a completely undocumented ordering requirement versus perf hardcoded in the notifier priority. Move the callback to the proper place in the state machine. Note, the original code did not check the return values of the setup functions and I could not be bothered to twist my brain around undoing the previous steps. Marked with a FIXME. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_intel_uncore.c | 109 ++ include/linux/cpuhotplug.h|3 2 files changed, 30 insertions(+), 82 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel_uncore.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c @@ -2622,7 +2622,7 @@ static void __init uncore_pci_exit(void) } } -static void __cpuinit uncore_cpu_dead(int cpu) +static int __cpuinit uncore_dead_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2639,9 +2639,11 @@ static void __cpuinit uncore_cpu_dead(in kfree(box); } } + return 0; } -static int __cpuinit uncore_cpu_starting(int cpu) +/* Must run on the target cpu */ +static int __cpuinit uncore_starting_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2681,12 +2683,12 @@ static int __cpuinit uncore_cpu_starting return 0; } -static int __cpuinit uncore_cpu_prepare(int cpu, int phys_id) +static int __cpuinit uncore_prepare_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; struct intel_uncore_box *box; - int i, j; + int i, j, phys_id = -1; for (i = 0; msr_uncores[i]; i++) { type = msr_uncores[i]; @@ -2745,13 +2747,13 @@ uncore_change_context(struct intel_uncor } } -static void __cpuinit uncore_event_exit_cpu(int cpu) +static int __cpuinit uncore_offline_cpu(unsigned int cpu) { int i, phys_id, target; /* if exiting cpu is used for collecting uncore events */ if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask)) - return; + return 0; /* find a new cpu to collect uncore events */ phys_id = topology_physical_package_id(cpu); @@ -2771,78 +2773,29 @@ static void __cpuinit uncore_event_exit_ uncore_change_context(msr_uncores, cpu, target); uncore_change_context(pci_uncores, cpu, target); + return 0; } -static void __cpuinit uncore_event_init_cpu(int cpu) +static int __cpuinit uncore_online_cpu(unsigned int cpu) { int i, phys_id; phys_id = topology_physical_package_id(cpu); for_each_cpu(i, &uncore_cpu_mask) { if (phys_id == topology_physical_package_id(i)) - return; + return 0; } cpumask_set_cpu(cpu, &uncore_cpu_mask); uncore_change_context(msr_uncores, -1, cpu); uncore_change_context(pci_uncores, -1, cpu); -} - -static int - __cpuinit uncore_cpu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) -{ - unsigned int cpu = (long)hcpu; - - /* allocate/free data structure for uncore box */ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - uncore_cpu_prepare(cpu, -1); - break; - case CPU_STARTING: - uncore_cpu_starting(cpu); - break; - case CPU_UP_CANCELED: - case CPU_DEAD: - uncore_cpu_dead(cpu); - break; - default: - break; - } - - /* select the cpu that collects uncore events */ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DOWN_FAILED: - case CPU_STARTING: - uncore_event_init_cpu(cpu); - break; - case CPU_DOWN_PREPARE: - uncore_event_exit_cpu(cpu); - break; - default: - break; - } - - return NOTIFY_OK; -} - -static struct 
notifier_block uncore_cpu_nb __cpuinitdata = { - .notifier_call = uncore_cpu_notifier, - /* -* to migrate uncore events, our notifier should be executed -* before perf core's notifier. -*/ - .priority = CPU_PRI_PERF + 1, -}; - -static void __init uncore_cpu_setup(void *dummy) -{ - uncore_cpu_starting(smp_processor_id()); + return 0; } static int __init uncore_cpu_init(void) { - int ret, cpu, max_cores; + int ret, max_cores; max_cores = boot_cpu_data.x86_max_cores; switch (boot_cpu_data.x86_model) { @@ -2879,28 +283
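The changelog's point about the undocumented ordering requirement is visible in the removed notifier: ".priority = CPU_PRI_PERF + 1" was the only thing guaranteeing that the uncore callbacks ran in the right order relative to the core perf callbacks. In the state machine that constraint is presumably carried by the position of the states in the enum, which at least makes it greppable in one place. Excerpted state names from the cpuhotplug.h hunks in this series, comments mine:

        enum cpuhp_states {
                /* ... */
                CPUHP_AP_PERF_X86_UNCORE_STARTING,      /* uncore before ... */
                CPUHP_AP_PERF_X86_AMD_IBS_STARTING,
                CPUHP_AP_PERF_X86_STARTING,             /* ... the core x86 PMU */
                /* ... */
        };

Since the series never states it outright here, treat the exact run order (forward on bringup, reverse on teardown) as an assumption of this note rather than documented behaviour.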
[patch 31/40] sched: Convert fair nohz balancer to hotplug state machine
Straight forward conversion which leaves the question whether this couldn't be combined with already existing infrastructure in the scheduler instead of having an extra state. Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |6 ++ kernel/cpu.c |4 kernel/sched/fair.c| 16 ++-- 3 files changed, 12 insertions(+), 14 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -29,6 +29,7 @@ enum cpuhp_states { CPUHP_AP_NOTIFY_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, + CPUHP_AP_SCHED_NOHZ_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, CPUHP_TEARDOWN_CPU, @@ -126,6 +127,11 @@ int sched_migration_dead_cpu(unsigned in #define sched_migration_dying_cpu NULL #define sched_migration_dead_cpu NULL #endif +#if defined(CONFIG_NO_HZ) +int nohz_balance_exit_idle(unsigned int cpu); +#else +#define nohz_balance_exit_idle NULL +#endif /* Performance counter hotplug functions */ #ifdef CONFIG_PERF_EVENTS Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -813,6 +813,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_SCHED_NOHZ_DYING] = { + .startup = NULL, + .teardown = nohz_balance_exit_idle, + }, [CPUHP_AP_SCHED_MIGRATE_DYING] = { .startup = NULL, .teardown = sched_migration_dying_cpu, Index: linux-2.6/kernel/sched/fair.c === --- linux-2.6.orig/kernel/sched/fair.c +++ linux-2.6/kernel/sched/fair.c @@ -5390,13 +5390,14 @@ static void nohz_balancer_kick(int cpu) return; } -static inline void nohz_balance_exit_idle(int cpu) +int nohz_balance_exit_idle(unsigned int cpu) { if (unlikely(test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu { cpumask_clear_cpu(cpu, nohz.idle_cpus_mask); atomic_dec(&nohz.nr_cpus); clear_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)); } + return 0; } static inline void set_cpu_sd_state_busy(void) @@ -5448,18 +5449,6 @@ void nohz_balance_enter_idle(int cpu) atomic_inc(&nohz.nr_cpus); set_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)); } - -static int __cpuinit sched_ilb_notifier(struct notifier_block *nfb, - unsigned long action, void *hcpu) -{ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DYING: - nohz_balance_exit_idle(smp_processor_id()); - return NOTIFY_OK; - default: - return NOTIFY_DONE; - } -} #endif static DEFINE_SPINLOCK(balancing); @@ -6167,7 +6156,6 @@ __init void init_sched_fair_class(void) #ifdef CONFIG_NO_HZ nohz.next_balance = jiffies; zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT); - cpu_notifier(sched_ilb_notifier, 0); #endif #endif /* SMP */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
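Worth spelling out what the sched/fair.c hunk actually changes: nohz_balance_exit_idle() loses its "static inline void (int cpu)" signature and becomes the int (*)(unsigned int) shape the state table expects, returning 0 unconditionally. Where changing the helper itself is undesirable, the same conversion can be done with a thin adapter; a sketch with placeholder names, not what this patch does:

        static void foo_exit_idle(int cpu)
        {
                /* stands in for the existing void helper, left untouched */
        }

        static int foo_dying_cpu(unsigned int cpu)
        {
                /* adapter with the int (*)(unsigned int) signature the table wants */
                foo_exit_idle(cpu);
                return 0;
        }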
[patch 29/40] s390: Convert vtime to hotplug state machine
Signed-off-by: Thomas Gleixner --- arch/s390/kernel/vtime.c | 18 +- include/linux/cpuhotplug.h |1 + 2 files changed, 6 insertions(+), 13 deletions(-) Index: linux-2.6/arch/s390/kernel/vtime.c === --- linux-2.6.orig/arch/s390/kernel/vtime.c +++ linux-2.6/arch/s390/kernel/vtime.c @@ -382,25 +382,17 @@ void __cpuinit init_cpu_vtimer(void) set_vtimer(VTIMER_MAX_SLICE); } -static int __cpuinit s390_nohz_notify(struct notifier_block *self, - unsigned long action, void *hcpu) +static int __cpuinit s390_vtime_dying_cpu(unsigned int cpu) { - struct s390_idle_data *idle; - long cpu = (long) hcpu; + struct s390_idle_data *idle = &per_cpu(s390_idle, cpu); - idle = &per_cpu(s390_idle, cpu); - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DYING: - idle->nohz_delay = 0; - default: - break; - } - return NOTIFY_OK; + idle->nohz_delay = 0; + return 0; } void __init vtime_init(void) { /* Enable cpu timer interrupts on the boot cpu. */ init_cpu_vtimer(); - cpu_notifier(s390_nohz_notify, 0); + cpuhp_setup_state(CPUHP_AP_S390_VTIME_DYING, NULL, s390_vtime_dying_cpu); } Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -27,6 +27,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, CPUHP_TEARDOWN_CPU, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
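vtime_init() above shows the teardown-only case: the startup argument to cpuhp_setup_state() is NULL, so nothing happens on the way up and only s390_vtime_dying_cpu() runs when a CPU goes down, replacing a notifier that ignored everything but CPU_DYING. A minimal sketch of that shape with placeholder names (CPUHP_AP_FOO_DYING would have to exist in the enum):

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>

        static int foo_dying_cpu(unsigned int cpu)
        {
                /* only the down direction matters for this user */
                return 0;
        }

        void __init foo_init(void)
        {
                /* NULL startup: the up direction stays a no-op */
                cpuhp_setup_state(CPUHP_AP_FOO_DYING, NULL, foo_dying_cpu);
        }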
[patch 28/40] cpuhotplug: Remove CPU_STARTING notifier
All users converted to state machine. Signed-off-by: Thomas Gleixner --- include/linux/cpu.h|6 -- include/linux/cpuhotplug.h |1 - kernel/cpu.c | 13 + 3 files changed, 1 insertion(+), 19 deletions(-) Index: linux-2.6/include/linux/cpu.h === --- linux-2.6.orig/include/linux/cpu.h +++ linux-2.6/include/linux/cpu.h @@ -67,10 +67,6 @@ extern ssize_t arch_print_cpu_modalias(s * sleep, must not fail */ #define CPU_POST_DEAD 0x0009 /* CPU (unsigned)v dead, cpu_hotplug * lock is dropped */ -#define CPU_STARTING 0x000A /* CPU (unsigned)v soon running. - * Called on the new cpu, just before - * enabling interrupts. Must not sleep, - * must not fail */ /* Used for CPU hotplug events occurring while tasks are frozen due to a suspend * operation in progress @@ -84,8 +80,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_FAILED_FROZEN (CPU_DOWN_FAILED | CPU_TASKS_FROZEN) #define CPU_DEAD_FROZEN(CPU_DEAD | CPU_TASKS_FROZEN) #define CPU_DYING_FROZEN (CPU_DYING | CPU_TASKS_FROZEN) -#define CPU_STARTING_FROZEN(CPU_STARTING | CPU_TASKS_FROZEN) - #ifdef CONFIG_SMP extern bool cpuhp_tasks_frozen; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -26,7 +26,6 @@ enum cpuhp_states { CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, - CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -216,12 +216,6 @@ static int bringup_cpu(unsigned int cpu) return 0; } -static int notify_starting(unsigned int cpu) -{ - cpu_notify(CPU_STARTING, cpu); - return 0; -} - #ifdef CONFIG_HOTPLUG_CPU EXPORT_SYMBOL(register_cpu_notifier); @@ -446,10 +440,9 @@ EXPORT_SYMBOL(cpu_down); #endif /*CONFIG_HOTPLUG_CPU*/ /** - * notify_cpu_starting(cpu) - call the CPU_STARTING notifiers + * notify_cpu_starting(cpu) - Invoke the callbacks on the starting CPU * @cpu: cpu that just started * - * This function calls the cpu_chain notifiers with CPU_STARTING. * It must be called by the arch code on the new cpu, before the new cpu * enables interrupts and before the "boot" cpu returns from __cpu_up(). */ @@ -816,10 +809,6 @@ static struct cpuhp_step cpuhp_ap_states .startup = sched_starting_cpu, .teardown = NULL, }, - [CPUHP_AP_NOTIFY_STARTING] = { - .startup = notify_starting, - .teardown = NULL, - }, [CPUHP_AP_NOTIFY_DYING] = { .startup = NULL, .teardown = notify_dying, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 26/40] arm: perf: Convert to hotplug state machine
Straight forward conversion w/o bells and whistles. Signed-off-by: Thomas Gleixner --- arch/arm/kernel/perf_event_cpu.c | 28 +--- include/linux/cpuhotplug.h |1 + 2 files changed, 6 insertions(+), 23 deletions(-) Index: linux-2.6/arch/arm/kernel/perf_event_cpu.c === --- linux-2.6.orig/arch/arm/kernel/perf_event_cpu.c +++ linux-2.6/arch/arm/kernel/perf_event_cpu.c @@ -157,24 +157,13 @@ static void cpu_pmu_init(struct arm_pmu * UNKNOWN at reset, the PMU must be explicitly reset to avoid reading * junk values out of them. */ -static int __cpuinit cpu_pmu_notify(struct notifier_block *b, - unsigned long action, void *hcpu) +static int __cpuinit arm_perf_starting_cpu(unsigned int cpu) { - if ((action & ~CPU_TASKS_FROZEN) != CPU_STARTING) - return NOTIFY_DONE; - if (cpu_pmu && cpu_pmu->reset) cpu_pmu->reset(cpu_pmu); - else - return NOTIFY_DONE; - - return NOTIFY_OK; + return 0; } -static struct notifier_block __cpuinitdata cpu_pmu_hotplug_notifier = { - .notifier_call = cpu_pmu_notify, -}; - /* * PMU platform driver and devicetree bindings. */ @@ -304,16 +293,9 @@ static struct platform_driver cpu_pmu_dr static int __init register_pmu_driver(void) { - int err; - - err = register_cpu_notifier(&cpu_pmu_hotplug_notifier); - if (err) - return err; - - err = platform_driver_register(&cpu_pmu_driver); - if (err) - unregister_cpu_notifier(&cpu_pmu_hotplug_notifier); + int err = platform_driver_register(&cpu_pmu_driver); - return err; + return err ? err : cpuhp_setup_state_nocalls(CPUHP_AP_PERF_ARM_STARTING, +arm_perf_starting_cpu, NULL); } device_initcall(register_pmu_driver); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_PERF_ARM_STARTING, CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 25/40] arm: Convert VFP hotplug notifiers to state machine
Straight forward conversion plus commentry why code which is executed in hotplug callbacks needs to be invoked before installing them. Signed-off-by: Thomas Gleixner --- arch/arm/vfp/vfpmodule.c | 29 + include/linux/cpuhotplug.h |1 + 2 files changed, 18 insertions(+), 12 deletions(-) Index: linux-2.6/arch/arm/vfp/vfpmodule.c === --- linux-2.6.orig/arch/arm/vfp/vfpmodule.c +++ linux-2.6/arch/arm/vfp/vfpmodule.c @@ -633,19 +633,19 @@ int vfp_restore_user_hwstate(struct user * hardware state at every thread switch. We clear our held state when * a CPU has been killed, indicating that the VFP hardware doesn't contain * a threads VFP state. When a CPU starts up, we re-enable access to the - * VFP hardware. - * - * Both CPU_DYING and CPU_STARTING are called on the CPU which + * VFP hardware. The callbacks below are called on the CPU which * is being offlined/onlined. */ -static int vfp_hotplug(struct notifier_block *b, unsigned long action, - void *hcpu) +static int __cpuinit vfp_dying_cpu(unsigned int cpu) { - if (action == CPU_DYING || action == CPU_DYING_FROZEN) { - vfp_force_reload((long)hcpu, current_thread_info()); - } else if (action == CPU_STARTING || action == CPU_STARTING_FROZEN) - vfp_enable(NULL); - return NOTIFY_OK; + vfp_force_reload(cpu, current_thread_info()); + return 0; +} + +static int __cpuinit vfp_starting_cpu(unsigned int unused) +{ + vfp_enable(NULL); + return 0; } /* @@ -653,9 +653,13 @@ static int vfp_hotplug(struct notifier_b */ static int __init vfp_init(void) { - unsigned int vfpsid; unsigned int cpu_arch = cpu_architecture(); + unsigned int vfpsid; + /* +* Enable the access to the VFP on all online cpus so the +* following test on FPSID will succeed. +*/ if (cpu_arch >= CPU_ARCH_ARMv6) on_each_cpu(vfp_enable, NULL, 1); @@ -676,7 +680,8 @@ static int __init vfp_init(void) else if (vfpsid & FPSID_NODOUBLE) { pr_cont("no double precision support\n"); } else { - hotcpu_notifier(vfp_hotplug, 0); + cpuhp_setup_state_nocall(CPUHP_AP_ARM_VFP_STARTING, +vfp_starting_cpu, vfp_dying_cpu); VFP_arch = (vfpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT; /* Extract the architecture version */ pr_cont("implementor %02x architecture %d part %02x variant %x rev %x\n", Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
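The comment added to vfp_init() is the key ordering point of both ARM patches: the registration variant used here (spelled cpuhp_setup_state_nocalls in the ARM perf patch and cpuhp_setup_state_nocall in the VFP hunk) apparently installs the callbacks without invoking them for CPUs that are already online, so whatever those callbacks would have done must be done by hand first, here via on_each_cpu(vfp_enable, ...). A sketch of that idiom with placeholder names, assuming exactly that semantic for cpuhp_setup_state_nocalls():

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>
        #include <linux/smp.h>

        static void foo_enable(void *unused)
        {
                /* the work the STARTING callback would do, run on each online CPU */
        }

        static int foo_starting_cpu(unsigned int cpu)
        {
                foo_enable(NULL);
                return 0;
        }

        static int foo_dying_cpu(unsigned int cpu)
        {
                return 0;
        }

        static int __init foo_init(void)
        {
                /* 1) bring the CPUs that are already online into the expected state */
                on_each_cpu(foo_enable, NULL, 1);

                /* 2) then install callbacks for CPUs that come and go later */
                cpuhp_setup_state_nocalls(CPUHP_AP_FOO_STARTING,
                                          foo_starting_cpu, foo_dying_cpu);
                return 0;
        }
        device_initcall(foo_init);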