Latest kernel doesn't boot
The latest Linux kernel doesn't boot on my computer (h=21511abd0a248a3f225d3b611cfabb93124605a7). elilo hangs while booting this kernel. 2.6.24 works.
Re: Latest kernel doesn't boot
Zitat von "H. Peter Anvin" <[EMAIL PROTECTED]>: > [EMAIL PROTECTED] wrote: > > The latest linux kernel doesn't boot on my computer > > (h=21511abd0a248a3f225d3b611cfabb93124605a7). > > > > elilo hangs while booting this kernel. 2.6.24 works. > > Wow, so we know it's affected with EFI, since you're using elilo. > > You gave absolutely zero other information about your system or what > "doesn't boot" mean. It's a macbook pro with a core duo processor (the one with only 32bits). Doesn't boot means: i enter the name of the kernel image i want to boot in elilo and press enter and nothing happens, it just hangs. i need to press the power button for a few seconds to power off. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
question regarding user mode linux
I'm trying to set up a User Mode Linux session and dmesg says:

[cut]
ubda: unknown partition table
Choosing a random ethernet address for device eth0
Netdevice 0 (9e:69:a0:f3:f1:f0) : TUN/TAP backend - IP = 192.168.5.1
Filesystem "ubda": Disabling barriers, not supported by the underlying device
[cut]

i.e. eth0. But the network device is:

# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:      0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth6:      0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0

i.e. eth6.

# uname -a
Linux localhost 2.6.24-rc7-gd0c4c9d4 #1 Sat Jan 12 13:25:44 CET 2008 i686 UML User Mode Linux GNU/Linux

Any ideas what could be wrong here?

Kind regards,
Thomas
[oops] ipppd/isdn
t symbol <=
>>EIP; c010e33f <=
Trace; c011415c <__run_task_queue+50/64>
Trace; c012b47a <__wait_on_buffer+6a/8c>
Trace; c012b61e
Trace; c012b73a
Trace; c0192d1c
Trace; c0192d7a
Trace; c011018e
Trace; c0112c6b
Trace; c010d658
Trace; c01070be
Trace; c010d9a7
Trace; c010d658
Trace; c010f389
Trace; c010fd59
Trace; c01057c0
Trace; c0106c58
Code; c010e33f <_EIP>:
Code; c010e33f <=                  0: 0f 0b       ud2a   <=
Code; c010e341                     2: 8d 65 dc    lea    0xffdc(%ebp),%esp
Code; c010e344                     5: 5b          pop    %ebx
Code; c010e345                     6: 5e          pop    %esi
Code; c010e346                     7: 5f          pop    %edi
Code; c010e347                     8: 89 ec       mov    %ebp,%esp
Code; c010e349                     a: 5d          pop    %ebp
Code; c010e34a                     b: c3          ret
Code; c010e34b                     c: 90          nop
Code; c010e34c <__wake_up+0/90>    d: 55          push   %ebp
Code; c010e34d <__wake_up+1/90>    e: 89 e5       mov    %esp,%ebp
Code; c010e34f <__wake_up+3/90>   10: 83 ec 0c    sub    $0xc,%esp
Code; c010e352 <__wake_up+6/90>   13: 57          push   %edi

<0>Kernel panic: aiee, killing interrupt handler!

3 warnings issued. Results may not be reliable.

[3] syslog:

Jun 29 23:28:49 knecht ipppd[237]: Modem hangup
Jun 29 23:28:49 knecht ipppd[237]: Connection terminated.
Jun 29 23:28:49 knecht ipppd[237]: taking down PHASE_DEAD link 0, linkunit: 0
Jun 29 23:28:49 knecht ipppd[237]: closing fd 8 from unit 0
Jun 29 23:28:49 knecht ipppd[237]: link 0 closed , linkunit: 0
Jun 29 23:28:49 knecht ipppd[237]: reinit_unit: 0
Jun 29 23:28:49 knecht ipppd[237]: Connect[0]: /dev/ippp0, fd: 8
Jun 29 23:28:49 knecht kernel: ippp_ccp: freeing reset data structure c352c000
Jun 29 23:28:49 knecht kernel: ippp, open, slot: 0, minor: 0, state:
Jun 29 23:28:49 knecht kernel: ippp_ccp: allocated reset data structure c352c000
Jun 29 23:28:49 knecht ipppd[237]: Modem hangup
Jun 29 23:28:49 knecht ipppd[237]: Connection terminated.
Jun 29 23:28:49 knecht ipppd[237]: taking down PHASE_DEAD link 1, linkunit: 1
Jun 29 23:28:49 knecht ipppd[237]: closing fd 9 from unit 1
Jun 29 23:28:49 knecht ipppd[237]: link 1 closed , linkunit: 1
Jun 29 23:28:49 knecht ipppd[237]: reinit_unit: 1
Jun 29 23:28:49 knecht ipppd[237]: Connect[1]: /dev/ippp1, fd: 9
Jun 29 23:28:49 knecht kernel: ippp_ccp: freeing reset data structure c352c800
Jun 29 23:28:49 knecht kernel: ippp, open, slot: 1, minor: 1, state:
Jun 29 23:28:49 knecht kernel: ippp_ccp: allocated reset data structure c352c800
Jun 29 23:30:21 knecht ipppd[237]: Local number: x, Remote number: x,
Jun 29 23:30:21 knecht ipppd[237]: PHASE_WAIT -> PHASE_ESTABLISHED, ifunit: 0, l
Jun 29 23:30:21 knecht ipppd[237]: Remote message:
Jun 29 23:30:21 knecht ipppd[237]: MPPP negotiation, He: Yes We: Yes

thx, thomas
Re: pdflush stuck in D state with v2.6.24-rc1-192-gef49c32
Florin Iucha iucha.net> writes:
> > It's really curious - I tried your .config and commands, and still
> > could not trigger the high iowait. I'm running 64bit Intel Core 2,
> > and kernel 2.6.24-rc1-git6 with the above patch.
>
> Curious but 100% reproducible, at least on my box. What I'm going to
> try is booting into the kernel with your patch and just doing the find
> / md5sum. It would be really interesting if the read-only access
> triggers it.
>
> florin

I can confirm this issue on any .24-rc too. I'm also using reiserfs on LVM. And there is one more user on the Gentoo forums having the same issue: http://forums.gentoo.org/viewtopic-t-612959.html

So you are not alone, Florin.
New patch: drm-populated memory types
This one incorporates some of Arjan's suggestions and a fix for the i810 problem introduced with the previous patch.

/Thomas
[PATCH] agpgart: Allow drm-populated agp memory types
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 130 ++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 186 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 19 files changed, 296 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..f821243 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_alloc_page_array(size_t size, struct agp_memory *mem); +void agp_free_page_array(struct agp_memory *mem); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct agp_bridge_driver ali_m154 .free_by_type = 
agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_bridge_driver alpha_core_agp_ .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .
[PATCH] agpgart: Allow drm-populated agp memory types update2
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 130 ++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 186 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/parisc-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sis-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 21 files changed, 298 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..f821243 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_alloc_page_array(size_t size, struct agp_memory *mem); +void agp_free_page_array(struct agp_memory *mem); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct 
agp_bridge_driver ali_m154 .free_by_type = agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_br
[PATCH] [AGPGART] Add agp-type-to-mask-type method missing from some drivers.
From: Thomas Hellstrom <[EMAIL PROTECTED]>

diff --git a/drivers/char/agp/parisc-agp.c b/drivers/char/agp/parisc-agp.c
index 17c50b0..b7b4590 100644
--- a/drivers/char/agp/parisc-agp.c
+++ b/drivers/char/agp/parisc-agp.c
@@ -228,6 +228,7 @@ struct agp_bridge_driver parisc_agp_driv
 	.free_by_type		= agp_generic_free_by_type,
 	.agp_alloc_page		= agp_generic_alloc_page,
 	.agp_destroy_page	= agp_generic_destroy_page,
+	.agp_type_to_mask_type	= agp_generic_type_to_mask_type,
 	.cant_use_aperture	= 1,
 };
diff --git a/drivers/char/agp/sis-agp.c b/drivers/char/agp/sis-agp.c
index a00fd48..60342b7 100644
--- a/drivers/char/agp/sis-agp.c
+++ b/drivers/char/agp/sis-agp.c
@@ -140,6 +140,7 @@ static struct agp_bridge_driver sis_driv
 	.free_by_type		= agp_generic_free_by_type,
 	.agp_alloc_page		= agp_generic_alloc_page,
 	.agp_destroy_page	= agp_generic_destroy_page,
+	.agp_type_to_mask_type	= agp_generic_type_to_mask_type,
 };

 static struct agp_device_ids sis_agp_device_ids[] __devinitdata =
--
1.4.1
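For bridge drivers that have no driver-specific memory types, the new agp_type_to_mask_type hook simply points at the generic helper, as the hunks above do. A minimal sketch of what such a generic fallback can look like, following the AGP_USER_TYPES convention from the patch description (an illustration against the declarations in drivers/char/agp/agp.h and include/linux/agp_backend.h, not the code from the patch itself):

static int example_type_to_mask_type(struct agp_bridge_data *bridge, int type)
{
	/*
	 * Sketch only: hardware types below AGP_USER_TYPES map one-to-one
	 * onto mask types, while drm/user-populated types fall back to the
	 * default (uncached) mask type 0.
	 */
	if (type >= AGP_USER_TYPES)
		return 0;
	return type;
}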
agpgart: drm-populated memory types
Dave and Arjan,

I'm resending a slightly reworked version of the agpgart patch for drm-populated memory types. The address-based vmalloc / vfree has been replaced and encapsulated in agp_vkmalloc() / agp_vkfree(), which both take a flag argument indicating whether to use vmalloc or kmalloc. This, at least, gets rid of the portability problem, and the chances of running into trouble in the future will be small if all allocs / frees of these memory areas are done using these functions.

A short recap of why I believe the kmalloc / vmalloc construct is necessary:

0) The current code uses vmalloc only.
1) The allocated area ranges from 4 bytes possibly up to 512 kB, depending on the size of the AGP buffer allocated.
2) Large buffers are very few. Small buffers tend to be quite many. If we continue to use vmalloc only, or another page-based scheme, we will waste approximately one page per buffer, together with the added slowness of vmalloc. This will severely hurt applications with a lot of small texture buffers.

Please let me know if you still consider this unacceptable. In that case I suggest sticking with vmalloc for now. Also please let me know if there are other parts of the patch that should be reworked.

The patch that follows is against Dave's agpgart repo.

Regards,
Thomas
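To make the size-based split concrete, here is a minimal sketch of the agp_vkmalloc() / agp_vkfree() pair described above. The prototypes follow the accompanying patch; the bodies and the one-page threshold are illustrative assumptions, not the patch code:

#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

/*
 * Sketch: small buffers come from kmalloc (avoiding the per-buffer page
 * waste and the slowness of vmalloc), large buffers from vmalloc. The
 * flag records which allocator was used so the matching free is called.
 */
void agp_vkmalloc(size_t size, unsigned long **addr, u8 *vmalloc_flag)
{
	if (size <= PAGE_SIZE) {		/* assumed threshold */
		*addr = kmalloc(size, GFP_KERNEL);
		*vmalloc_flag = 0;
	} else {
		*addr = vmalloc(size);
		*vmalloc_flag = 1;
	}
}

void agp_vkfree(unsigned long *addr, u8 vmalloc_flag)
{
	if (vmalloc_flag)
		vfree(addr);
	else
		kfree(addr);
}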
[PATCH] agpgart: Allow drm-populated agp memory types
From: Thomas Hellstrom <[EMAIL PROTECTED]> This patch allows drm to populate an agpgart structure with pages of its own. It's needed for the new drm memory manager which dynamically flips pages in and out of AGP. The patch modifies the generic functions as well as the intel agp driver. The intel drm driver is currently the only one supporting the new memory manager. Other agp drivers may need some minor fixing up once they have a corresponding memory manager enabled drm driver. AGP memory types >= AGP_USER_TYPES are not populated by the agpgart driver, but the drm is expected to do that, as well as taking care of cache- and tlb flushing when needed. It's not possible to request these types from user space using agpgart ioctls. The Intel driver also gets a new memory type for pages that can be bound cached to the intel GTT. Signed-off-by: Thomas Hellstrom <[EMAIL PROTECTED]> --- drivers/char/agp/agp.h | 10 ++ drivers/char/agp/ali-agp.c |2 drivers/char/agp/alpha-agp.c|4 + drivers/char/agp/amd-k7-agp.c |1 drivers/char/agp/amd64-agp.c| 11 ++ drivers/char/agp/ati-agp.c |1 drivers/char/agp/backend.c |2 drivers/char/agp/efficeon-agp.c |1 drivers/char/agp/frontend.c |3 + drivers/char/agp/generic.c | 133 +++- drivers/char/agp/hp-agp.c |1 drivers/char/agp/i460-agp.c |7 + drivers/char/agp/intel-agp.c| 185 +-- drivers/char/agp/nvidia-agp.c |1 drivers/char/agp/sgi-agp.c |1 drivers/char/agp/sworks-agp.c |1 drivers/char/agp/uninorth-agp.c |2 drivers/char/agp/via-agp.c |2 include/linux/agp_backend.h |5 + 19 files changed, 298 insertions(+), 75 deletions(-) diff --git a/drivers/char/agp/agp.h b/drivers/char/agp/agp.h index 1d59e2a..7c75389 100644 --- a/drivers/char/agp/agp.h +++ b/drivers/char/agp/agp.h @@ -114,6 +114,7 @@ struct agp_bridge_driver { void (*free_by_type)(struct agp_memory *); void *(*agp_alloc_page)(struct agp_bridge_data *); void (*agp_destroy_page)(void *); +int (*agp_type_to_mask_type) (struct agp_bridge_data *, int); }; struct agp_bridge_data { @@ -218,6 +219,7 @@ #define I810_PTE_BASE 0x1 #define I810_PTE_MAIN_UNCACHED 0x #define I810_PTE_LOCAL 0x0002 #define I810_PTE_VALID 0x0001 +#define I830_PTE_SYSTEM_CACHED 0x0006 #define I810_SMRAM_MISCC 0x70 #define I810_GFX_MEM_WIN_SIZE 0x0001 #define I810_GFX_MEM_WIN_32M 0x0001 @@ -270,8 +272,16 @@ void global_cache_flush(void); void get_agp_version(struct agp_bridge_data *bridge); unsigned long agp_generic_mask_memory(struct agp_bridge_data *bridge, unsigned long addr, int type); +int agp_generic_type_to_mask_type(struct agp_bridge_data *bridge, + int type); struct agp_bridge_data *agp_generic_find_bridge(struct pci_dev *pdev); +/* generic functions for user-populated AGP memory types */ +struct agp_memory *agp_generic_alloc_user(size_t page_count, int type); +void agp_vkmalloc(size_t size, unsigned long **addr, u8 *vmalloc_flag); +void agp_vkfree(unsigned long *addr, u8 vmalloc_flag); + + /* generic routines for agp>=3 */ int agp3_generic_fetch_size(void); void agp3_generic_tlbflush(struct agp_memory *mem); diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c index 5a31ec7..98177a9 100644 --- a/drivers/char/agp/ali-agp.c +++ b/drivers/char/agp/ali-agp.c @@ -214,6 +214,7 @@ static struct agp_bridge_driver ali_gene .free_by_type = agp_generic_free_by_type, .agp_alloc_page = agp_generic_alloc_page, .agp_destroy_page = ali_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; static struct agp_bridge_driver ali_m1541_bridge = { @@ -237,6 +238,7 @@ static struct agp_bridge_driver ali_m154 .free_by_type = 
agp_generic_free_by_type, .agp_alloc_page = m1541_alloc_page, .agp_destroy_page = m1541_destroy_page, + .agp_type_to_mask_type = agp_generic_type_to_mask_type, }; diff --git a/drivers/char/agp/alpha-agp.c b/drivers/char/agp/alpha-agp.c index b4e00a3..b0acf41 100644 --- a/drivers/char/agp/alpha-agp.c +++ b/drivers/char/agp/alpha-agp.c @@ -91,6 +91,9 @@ static int alpha_core_agp_insert_memory( int num_entries, status; void *temp; + if (type >= AGP_USER_TYPES || mem->type >= AGP_USER_TYPES) + return -EINVAL; + temp = agp_bridge->current_size; num_entries = A_SIZE_FIX(temp)->num_entries; if ((pg_start + mem->page_count) > num_entries) @@ -142,6 +145,7 @@ struct agp_bridge_driver alpha_core_agp_ .free_by_type = agp_generic_free_by_type, .agp_alloc_page =
Re: 2.6.19-rc6-rt5
Something is really wrong with page alloc on this one. Compiled 2.6.19-rc6-rt5 with the one patch to page_alloc.c as posted on the list here. Kernel uses around 50% mem and 30% swap without doing anything. I get a lot of these: X invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 [] out_of_memory+0x176/0x1d0 [] __alloc_pages+0x286/0x2f0 [] __get_free_pages+0x46/0x60 [] __pollwait+0xb0/0x100 [] unix_poll+0xc6/0xd0 [] sock_poll+0x23/0x30 [] do_select+0x288/0x4c0 [] __pollwait+0x0/0x100 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] default_wake_function+0x0/0x20 [] core_sys_select+0x223/0x360 [] __schedule+0x2e9/0x6b0 [] convert_fxsr_from_user+0x22/0xf0 [] sys_select+0xff/0x1e0 [] sys_gettimeofday+0x3b/0x90 [] sysenter_past_esp+0x56/0x79 === Mem-info: DMA per-cpu: CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Normal per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 31 Cold: hi: 62, btch: 15 usd: 58 HighMem per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 66 Cold: hi: 62, btch: 15 usd: 14 Active:111463 inactive:36109 dirty:0 writeback:0 unstable:0 free:4018 slab:163934 mapped:26114 pagetables:874 DMA free:3560kB min:68kB low:84kB high:100kB active:396kB inactive:356kB present:16256kB pages_scanned:1370 all_unreclaimable? yes lowmem_reserve[]: 0 873 1254 Normal free:3720kB min:3744kB low:4680kB high:5616kB active:111304kB inactive:108296kB present:894080kB pages_scanned:339028 all_unreclaimable? yes lowmem_reserve[]: 0 0 3047 HighMem free:8792kB min:380kB low:788kB high:1196kB active:334152kB inactive:35784kB present:390084kB pages_scanned:0 all_unreclaimable? 
no lowmem_reserve[]: 0 0 0 DMA: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3560kB Normal: 0*4kB 5*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3720kB HighMem: 924*4kB 517*8kB 40*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 8792kB Swap cache: add 107141, delete 56933, find 4493/5856, race 0+0 Free swap = 113440kB Total swap = 488336kB Free swap: 113440kB 327664 pages of RAM 98288 pages of HIGHMEM 4383 reserved pages 94253 pages shared 50208 pages swap cached 0 pages dirty 0 pages writeback 26114 pages mapped 163934 pages slab 874 pages pagetables 327664 pages of RAM 98288 pages of HIGHMEM 4383 reserved pages 94253 pages shared 50208 pages swap cached 0 pages dirty 0 pages writeback 26114 pages mapped 163934 pages slab 874 pages pagetables audacious invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0 [] out_of_memory+0x176/0x1d0 [] __alloc_pages+0x286/0x2f0 [] cache_alloc_refill+0x30e/0x5d0 [] kmem_cache_alloc+0x57/0x60 [] sock_alloc_inode+0x19/0x60 [] alloc_inode+0x19/0x190 [] fget_light+0x85/0xa0 [] new_inode+0x16/0x90 [] sock_alloc+0x14/0x70 [] sys_accept+0x56/0x270 [] do_notify_resume+0x402/0x750 [] convert_fxsr_from_user+0x22/0xf0 [] sys_socketcall+0xd1/0x280 [] sysenter_past_esp+0x56/0x79 === Mem-info: DMA per-cpu: CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0 Normal per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 31 Cold: hi: 62, btch: 15 usd: 58 HighMem per-cpu: CPU0: Hot: hi: 186, btch: 31 usd: 66 Cold: hi: 62, btch: 15 usd: 14 Active:111494 inactive:36078 dirty:0 writeback:0 unstable:0 free:4018 slab:163934 mapped:26114 pagetables:874 DMA free:3560kB min:68kB low:84kB high:100kB active:396kB inactive:356kB present:16256kB pages_scanned:1370 all_unreclaimable? yes lowmem_reserve[]: 0 873 1254 Normal free:3720kB min:3744kB low:4680kB high:5616kB active:111420kB inactive:108180kB present:894080kB pages_scanned:339127 all_unreclaimable? yes lowmem_reserve[]: 0 0 3047 HighMem free:8792kB min:380kB low:788kB high:1196kB active:334160kB inactive:35776kB present:390084kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 DMA: 0*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3560kB Normal: 0*4kB 5*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3720kB HighMem: 924*4kB 517*8kB 40*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 8792kB Swap cache: add 107141, delete 56933, find 4493/5856, race 0+0 Free swap = 113440kB Total swap = 488336kB Free swap:
[BUG] unable to handle kernel NULL pointer dereference at virtual address 00000003
This Oops happens under heavy load with http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7b104bcb8e460e45a1aebe3da9b86aacdb4cab12 as head. I also ran the powertop tool in parallel with the build process. This is what I could capture with netconsole:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000003
printing eip:
c0136e32
*pde =
Oops: [#1] SMP
Modules linked in: applesmc evdev snd_seq snd_seq_device

System.map says:

c0136d47 t tick_nohz_handler
c0136e30 t match_entries
c0136e5a t tstats_open

match_entries is part of kernel/time/timer_stats.c and is only used within timer_stats.c. I guess without stack values this is hard to debug...

I cc'ed Ingo Molnar and Thomas Gleixner as they signed off the initial patch for the timer_stats support: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=82f67cd9fca8c8762c15ba7ed0d5747588c1e221

Kind regards,
Thomas
2.6.21-rc3 regression - bug 8066
Hello.

Regarding bug 8066 (http://bugzilla.kernel.org/show_bug.cgi?id=8066): is there a particular reason that prevents the patch from becoming part of the 2.6.21 release? Without this patch I have no battery icon, and this is a regression against 2.6.20.

With kind regards,
Thomas
Re: [5/6] 2.6.21-rc3: known regressions
> Subject    : suspend/resume hangs until keypress
> References : http://bugzilla.kernel.org/show_bug.cgi?id=8181
> Submitter  : Tomas Janousek <[EMAIL PROTECTED]>
> Status     : unknown

Can you please try to compile without nohz and without hrtimers and try it again? This may be the same error I encounter. See also: http://www.ussg.iu.edu/hypermail/linux/kernel/0703.1/1506.html

With kind regards,
Thomas
Re: evdev* devices change major/minors after suspend/resume (udev?)
Quoting Soeren Sonnenburg <[EMAIL PROTECTED]>:

> Very concrete it is this evdev that may be missing... and just FYI this
> also seems to cause trouble in Xorg - sometimes the appletouch mouse is
> not yet back...
>
> /dev/input/by-id/usb-Apple_Computer_Apple_Internal_Keyboard_._Trackpad-event-kbd -> ../event5
>
> Any hints welcome,

See also this discussion thread: http://www.uwsg.iu.edu/hypermail/linux/kernel/0703.3/0988.html

But there is no solution for this problem right now.

With kind regards,
Thomas
Bonded magnets for your Motor
Dear Sir/Madam,

This is Thomas from Topmag, Shenzhen, China. From your website, we know that our magnet products may be used in your products. Our branch company also manufactures and supplies raw materials such as high-class praseodymium-neodymium-iron alloy, dysprosium, chromium metal, etc. We are specialized in sintered & bonded NdFeB magnets, SmCo magnets, Alnico magnets, ferrite magnets and various kinds of magnetic assemblies.

Looking forward to hearing from you.

Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
xpad_probe: undefined reference to `led_classdev_register'
Hi.

Current Linus' git tree:

x86_64-unknown-linux-gnu-ld: BFD 2.15 assertion fail /home/thomas/source/crosstool-0.43/build/x86_64-unknown-linux-gnu/gcc-3.4.5-glibc-2.3.6/binutils-2.15/bfd/linker.c:619
drivers/built-in.o(.text+0x20749d): In function `xpad_probe':
: undefined reference to `led_classdev_register'
drivers/built-in.o(.text+0x20756c): In function `xpad_disconnect':
: undefined reference to `led_classdev_unregister'
make: *** [.tmp_vmlinux1] Fehler 1

Any ideas?

Kind regards,
Thomas
Magnetic assembles for your electric motor
Dear,

This is Thomas from a Chinese magnets company. A friend introduced me to you; most of your items may use bonded magnets or injection magnets. Maybe we can help you with the magnet items; if you need anything, please feel free to let us know.

Thanks,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
Re: [i915] BUG: Bad page state in process Xorg
Hi,

It turns out that this seems to be a bug in the udl DRM driver. I bisected the problem to this patch: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/udl?id=5dc9e1e87229cb786a5bb58ddd0d60fee6eb4641

With kind regards,
Thomas

On 22.11.2013 17:18, Daniel Vetter wrote:
>
> On Fri, Nov 22, 2013 at 4:54 PM, Thomas Meyer wrote:
>> On 22.11.2013 at 11:55, Daniel Vetter wrote:
>>
>> On Fri, Nov 22, 2013 at 11:36 AM, Dave Airlie wrote:
>>>> Hi,
>>>
>>> cc'ing mailing list,
>>>
>>> Daniel any ideas?
>>
>> Nope, not really :( And no ideas how to triage this further - if it
>> takes 9 days to hit it eventually we'll have a real hard time. Or does
>> this happen even after just a short X run?
>
> Seems to happen every time while stopping the X server. Also after a short
> run time.
>
> The current Fedora 3.11 kernel doesn't show this bug. I'm using Fedora 19
> with a self-compiled kernel.
>
> I did turn on CONFIG_DEBUG_PAGEALLOC but this didn't show any wrongness.

> In that case I think the bisect is the fastest way to insight - atm
> I'm really at a loss what could be wrong here.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Manufacturer of magnets/Supplier of Bosch
Dear Sir,

How are you? This is Thomas from China Topmagtech Ltd. We got your contact from a friend and know that you are buying some precisely magnetized magnets for your items. We are a professional manufacturer of sintered NdFeB magnets & magnetic assemblies. Our advantages are: precise magnetization, complicated shapes and magnetic assemblies, etc. We hope to have the chance to build long-term cooperation with you. We can offer you competitive prices and free samples. For any question or request, please feel free to contact us.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Injection magnets for your Automation
Dear,

Good day. May I have your attention? We are a professional manufacturer & exporter of magnets and magnetic products located in Shenzhen, China; with two production lines, the annual capacity exceeds 1,000 tons. Thanks to skilled workers and engineers, some of them having more than 15 years of experience in the magnet industry, we are able to provide you with good-quality magnets and professional consultancy. We are especially competitive in the N35-N50, H, SH, UH series, etc., which are widely used in motors, sensors, speakers, generators, wind turbines and other electric or industrial devices. Below are some pictures of our facilities; you can also view more product information on our Topmag website. If we could be of any further assistance, please don't hesitate to contact us directly. Thank you.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: magnet materials for your auto industry
Dear,

This is Thomas from a Chinese magnets company. A friend introduced me to you; most of your items may use bonded magnets or injection magnets. Maybe we can help you with the magnet items; if you need anything, please feel free to let us know.

Thanks,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: Motor parts magnet vendor - bonded and injection magnets/Topmagtech
Dear Purchasing Manager,

This is Thomas from a Chinese magnet company. As you may know, the price of raw materials has been rising, so it would be good timing for you to purchase magnets for your items. Feel free to contact me for any further questions or enquiries.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: Dalang street, Longhua district, Shenzhen, PRC
Tel: 86-755-29019871
E-mail: topm...@163.com
Supplying High Quality NdFeB Materials
Dear,

Good day. May I have your attention? We are a professional manufacturer & exporter of magnets and magnetic products located in Shenzhen, China; with two production lines, the annual capacity exceeds 1,000 tons. Thanks to skilled workers and engineers, some of them having more than 15 years of experience in the magnet industry, we are able to provide you with good-quality magnets and professional consultancy. We are especially competitive in the N35-N50, H, SH, UH series, etc., which are widely used in motors, sensors, speakers, generators, wind turbines and other electric or industrial devices. Below are some pictures of our facilities; you can also view more product information on our Topmag website. If we could be of any further assistance, please don't hesitate to contact us directly. Thank you.

Best Regards,
Thomas
---
Sales Engineer
Mobile: 0086-15889706837
Address: New Sanyo Motor Industrial Park, Haoyi Village, Shajing Town, Baoan District, Shenzhen, China 518104
Tel: (86) 755-29019871
Fax: (86) 755-29735987
Email: i...@topmagtech.com & topm...@163.com
Re: [PATCH 3/3] regulator: add device tree support for max8997
On 26 November 2012 19:41, Mark Brown wrote:
> On Mon, Nov 26, 2012 at 07:16:04PM +0530, Thomas Abraham wrote:
>
>> and this patch applied cleanly. Could you please let me know if there
>> is anything I need to be doing differently for this.
>
> Hrm, try applying it on the relevant topic branch. Your comments about
> rebasing on top of MFD changes did suggest that there's something in the
> MFD tree so I didn't check terribly closely.

I tried applying this patch on the max8997 branch in your regulator tree. But this patch does not apply cleanly on that branch because commits "5eb9f2b96381" (regulator: remove use of __devexit_p), "a5023574d120" (regulator: remove use of __devinit) and "8dc995f56ef7" (regulator: remove use of __devexit) are not available on that branch, while these commits are already in your for-next branch.

I am not sure if this is of any help in rebasing the patch onto the existing max8997 branch. If you could suggest how I should prepare this patch so that it applies cleanly for you, I could do that.

Thanks,
Thomas.
[PATCH] regulator: add device tree support for max8997
Add device tree based discovery support for max8997. Cc: Karol Lewandowski Cc: Rajendra Nayak Cc: Rob Herring Cc: Grant Likely Signed-off-by: Thomas Abraham Acked-by: MyungJoo Ham Reviewed-by: Tomasz Figa --- This patch is based on 'topic/max8997' branch of Mark Brown's regulator tree. .../bindings/regulator/max8997-regulator.txt | 146 +++ drivers/mfd/max8997.c | 73 ++- drivers/regulator/max8997.c| 148 +++- include/linux/mfd/max8997-private.h|1 + include/linux/mfd/max8997.h|1 + 5 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 Documentation/devicetree/bindings/regulator/max8997-regulator.txt diff --git a/Documentation/devicetree/bindings/regulator/max8997-regulator.txt b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt new file mode 100644 index 000..9fd69a1 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/max8997-regulator.txt @@ -0,0 +1,146 @@ +* Maxim MAX8997 Voltage and Current Regulator + +The Maxim MAX8997 is a multi-function device which includes volatage and +current regulators, rtc, charger controller and other sub-blocks. It is +interfaced to the host controller using a i2c interface. Each sub-block is +addressed by the host system using different i2c slave address. This document +describes the bindings for 'pmic' sub-block of max8997. + +Required properties: +- compatible: Should be "maxim,max8997-pmic". +- reg: Specifies the i2c slave address of the pmic block. It should be 0x66. + +- max8997,pmic-buck1-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck1 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +- max8997,pmic-buck2-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck2 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +- max8997,pmic-buck5-dvs-voltage: A set of 8 voltage values in micro-volt (uV) + units for buck5 when changing voltage using gpio dvs. Refer to [1] below + for additional information. + +[1] If none of the 'max8997,pmic-buck[1/2/5]-uses-gpio-dvs' optional +property is specified, the 'max8997,pmic-buck[1/2/5]-dvs-voltage' +property should specify atleast one voltage level (which would be a +safe operating voltage). + +If either of the 'max8997,pmic-buck[1/2/5]-uses-gpio-dvs' optional +property is specified, then all the eigth voltage values for the +'max8997,pmic-buck[1/2/5]-dvs-voltage' should be specified. + +Optional properties: +- interrupt-parent: Specifies the phandle of the interrupt controller to which + the interrupts from max8997 are delivered to. +- interrupts: Interrupt specifiers for two interrupt sources. + - First interrupt specifier is for 'irq1' interrupt. + - Second interrupt specifier is for 'alert' interrupt. +- max8997,pmic-buck1-uses-gpio-dvs: 'buck1' can be controlled by gpio dvs. +- max8997,pmic-buck2-uses-gpio-dvs: 'buck2' can be controlled by gpio dvs. +- max8997,pmic-buck5-uses-gpio-dvs: 'buck5' can be controlled by gpio dvs. + +Additional properties required if either of the optional properties are used: +- max8997,pmic-ignore-gpiodvs-side-effect: When GPIO-DVS mode is used for + multiple bucks, changing the voltage value of one of the bucks may affect + that of another buck, which is the side effect of the change (set_voltage). + Use this property to ignore such side effects and change the voltage. + +- max8997,pmic-buck125-default-dvs-idx: Default voltage setting selected from + the possible 8 options selectable by the dvs gpios. 
The value of this + property should be between 0 and 7. If not specified or if out of range, the + default value of this property is set to 0. + +- max8997,pmic-buck125-dvs-gpios: GPIO specifiers for three host gpio's used + for dvs. The format of the gpio specifier depends in the gpio controller. + +Regulators: The regulators of max8997 that have to be instantiated should be +included in a sub-node named 'regulators'. Regulator nodes included in this +sub-node should be of the format as listed below. + + regulator_name { + standard regulator bindings here + }; + +The following are the names of the regulators that the max8997 pmic block +supports. Note: The 'n' in LDOn and BUCKn represents the LDO or BUCK number +as per the datasheet of max8997. + + - LDOn + - valid values for n are 1 to 18 and 21 + - Example: LDO0, LD01, LDO2, LDO21 + - BUCKn + - valid values for n are 1 to 7. + - Example: BUCK1, BUCK2, BUCK3, BUCK7 + + - ENVICHG: Battery Charging Current Monitor Output. This
Re: [PATCH 1/3] i2c: exynos5: add High Speed I2C controller driver
> + i2c_del_adapter(&i2c->adap); > + free_irq(i2c->irq, i2c); > + > + clk_disable_unprepare(i2c->clk); > + clk_put(i2c->clk); > + > + iounmap(i2c->regs); > + > + release_resource(i2c->ioarea); > + exynos5_i2c_dt_gpio_free(i2c); > + kfree(i2c->ioarea); > + > + return 0; > +} > + > +#ifdef CONFIG_PM > +static int exynos5_i2c_suspend_noirq(struct device *dev) > +{ > + struct platform_device *pdev = to_platform_device(dev); > + struct exynos5_i2c *i2c = platform_get_drvdata(pdev); > + > + i2c_lock_adapter(&i2c->adap); > + i2c->suspended = 1; > + i2c_unlock_adapter(&i2c->adap); > + > + return 0; > +} > + > +static int exynos5_i2c_resume(struct device *dev) > +{ > + struct platform_device *pdev = to_platform_device(dev); > + struct exynos5_i2c *i2c = platform_get_drvdata(pdev); > + > + i2c_lock_adapter(&i2c->adap); > + clk_prepare_enable(i2c->clk); > + exynos5_i2c_init(i2c); > + clk_disable_unprepare(i2c->clk); > + i2c->suspended = 0; > + i2c_unlock_adapter(&i2c->adap); > + > + return 0; > +} > + > +static const struct dev_pm_ops exynos5_i2c_dev_pm_ops = { > + .suspend_noirq = exynos5_i2c_suspend_noirq, > + .resume_noirq = exynos5_i2c_resume, > +}; > + > +#define EXYNOS5_DEV_PM_OPS (&exynos5_i2c_dev_pm_ops) > +#else > +#define EXYNOS5_DEV_PM_OPS NULL > +#endif > + > +static struct platform_driver exynos5_i2c_driver = { > + .probe = exynos5_i2c_probe, > + .remove = exynos5_i2c_remove, > + .id_table = exynos5_driver_ids, > + .driver = { > + .owner = THIS_MODULE, > + .name = "exynos5-i2c", > + .pm = EXYNOS5_DEV_PM_OPS, > + .of_match_table = of_match_ptr(exynos5_i2c_match), > + }, > +}; > + > +static int __init i2c_adap_exynos5_init(void) > +{ > + return platform_driver_register(&exynos5_i2c_driver); > +} > +subsys_initcall(i2c_adap_exynos5_init); > + > +static void __exit i2c_adap_exynos5_exit(void) > +{ > + platform_driver_unregister(&exynos5_i2c_driver); > +} > +module_exit(i2c_adap_exynos5_exit); > + > +MODULE_DESCRIPTION("Exynos5 HS-I2C Bus driver"); > +MODULE_AUTHOR("Taekgyun Ko, "); > +MODULE_LICENSE("GPL"); > diff --git a/drivers/i2c/busses/i2c-exynos5.h > b/drivers/i2c/busses/i2c-exynos5.h > new file mode 100644 > index 000..063051e > --- /dev/null > +++ b/drivers/i2c/busses/i2c-exynos5.h > @@ -0,0 +1,80 @@ > +/* > + * Copyright (C) 2012 Samsung Electronics Co., Ltd. > + * > + * Exynos5 series HS-I2C Controller > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * published by the Free Software Foundation. 
> +*/ > + > +#ifndef __ASM_ARCH_REGS_HS_IIC_H > +#define __ASM_ARCH_REGS_HS_IIC_H __FILE__ > + > +/* > + * Register Map > + */ > +#define HSI2C_CTL 0x00 > +#define HSI2C_FIFO_CTL 0x04 > +#define HSI2C_TRAILIG_CTL 0x08 > +#define HSI2C_CLK_CTL 0x0C > +#define HSI2C_CLK_SLOT 0x10 > +#define HSI2C_INT_ENABLE 0x20 > +#define HSI2C_INT_STATUS 0x24 > +#define HSI2C_ERR_STATUS 0x2C > +#define HSI2C_FIFO_STATUS 0x30 > +#define HSI2C_TX_DATA 0x34 > +#define HSI2C_RX_DATA 0x38 > +#define HSI2C_CONF 0x40 > +#define HSI2C_AUTO_CONFING 0x44 > +#define HSI2C_TIMEOUT 0x48 > +#define HSI2C_MANUAL_CMD 0x4C > +#define HSI2C_TRANS_STATUS 0x50 > +#define HSI2C_TIMING_HS1 0x54 > +#define HSI2C_TIMING_HS2 0x58 > +#define HSI2C_TIMING_HS3 0x5C > +#define HSI2C_TIMING_FS1 0x60 > +#define HSI2C_TIMING_FS2 0x64 > +#define HSI2C_TIMING_FS3 0x68 > +#define HSI2C_TIMING_SLA 0x6C > +#define HSI2C_ADDR 0x70 > + > +/* I2C_CTL Register */ > +#define HSI2C_FUNC_MODE_I2C(1u << 0) > +#define HSI2C_MASTER (1u << 3) > +#define HSI2C_RXCHON (1u << 6) > +#define HSI2C_TXCHON (1u << 7) > + > +/* I2C_FIFO_CTL Register */ > +#define HSI2C_RXFIFO_EN(1u << 0) > +#define HSI2C_TXFIFO_EN(1u << 1) > +#define HSI2C_TXFIFO_TRIGGER_LEVEL (0x20 << 16) > +#define HSI2C_RXFIFO_TRIGGER_LEVEL (0x20 << 4) > + > +/* I2C_TRAILING_CTL Register */ > +#define HSI2C_TRAILING_COUNT (0xf) > + > +/* I2C_INT_EN Register */ > +#define HSI2C_INT_TX_ALMOSTEMPTY_EN(1u << 0) /* For TX FIFO */ > +#define HSI2C_INT_RX_ALMOSTFULL_EN (1u << 1) /* For RX FIFO */ > +#define HSI2C_INT_TRAILING_EN (1u << 6) > + > +/* I2C_CONF Register */ > +#define HSI2C_AUTO_MODE(1u << 31) > +#define HSI2C_10BIT_ADDR_MODE (1u << 30) > +#define HSI2C_HS_MODE (1u << 29) > + > +/* I2C_AUTO_CONF Register */ > +#define HSI2C_READ_WRITE (1u << 16) > +#define HSI2C_STOP_AFTER_TRANS (1u << 17) > +#define HSI2C_MASTER_RUN (1u << 31) > + > +/* I2C_TIMEOUT Register */ > +#define HSI2C_TIMEOUT_EN (1u << 31) > + > +#define HSI2C_FIFO_EMPTY (0x1000100) > + > +#define HSI2C_FS_BPS 40 > +#define HSI2C_HS_BPS 250 > + > +#endif /* __ASM_ARCH_REGS_HS_IIC_H */ Since these constants are only use in i2c-exynos5.c file, it is better to move these definitions into i2c-exynos5.c file. Thanks, Thomas. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6 v5] arm highbank: add support for pl320 IPC
Dear Mark Langsdorf,

On Tue, 27 Nov 2012 09:04:32 -0600, Mark Langsdorf wrote:

> +int ipc_transmit(u32 *data);

ipc_transmit() looks to me like a way too generic name to be exposed to the entire kernel.

> +extern int pl320_ipc_register_notifier(struct notifier_block *nb);
> +extern int pl320_ipc_unregister_notifier(struct notifier_block *nb);

Why some "extern" here? You don't have these for the other functions in this header file.

Thomas
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux development, consulting, training and support.
http://free-electrons.com
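For reference, the notifier pair quoted above follows the standard kernel notifier_block pattern, so a consumer of the pl320 IPC would hook in roughly as sketched below. The two extern prototypes are taken from the patch under review; the callback body and the example names are assumptions:

#include <linux/module.h>
#include <linux/notifier.h>

/* Prototypes as quoted from the patch under review. */
extern int pl320_ipc_register_notifier(struct notifier_block *nb);
extern int pl320_ipc_unregister_notifier(struct notifier_block *nb);

/* Called through the notifier chain for each incoming IPC message. */
static int example_ipc_notify(struct notifier_block *nb,
			      unsigned long action, void *data)
{
	/* consume the message carried in 'data' here */
	return NOTIFY_OK;
}

static struct notifier_block example_ipc_nb = {
	.notifier_call = example_ipc_notify,
};

static int __init example_ipc_init(void)
{
	return pl320_ipc_register_notifier(&example_ipc_nb);
}
module_init(example_ipc_init);

static void __exit example_ipc_exit(void)
{
	pl320_ipc_unregister_notifier(&example_ipc_nb);
}
module_exit(example_ipc_exit);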
Re: [regression] 3.7+ suspend to RAM/offline CPU fails with nmi_watchdog=0 (bisected)
On Wed, 28 Nov 2012, Joseph Salisbury wrote:
> On 11/23/2012 08:11 AM, Norbert Warmuth wrote:
> > Thomas Gleixner writes:
> > > On Wed, 21 Nov 2012, Norbert Warmuth wrote:
> > > > 3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
> > > > offline CPUs. It's reproducible with a KVM guest and a physical
> > > > system.
> > > Does the patch below fix it?
> > Yes.
> >
> > - Norbert
> >
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index 9d4c8d5..e3ef521 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -368,6 +368,9 @@ static void watchdog_disable(unsigned int cpu)
> > >  {
> > >  	struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
> > > +	if (!watchdog_enabled)
> > > +		return;
> > > +
> > >  	watchdog_set_prio(SCHED_NORMAL, 0);
> > >  	hrtimer_cancel(hrtimer);
> > >  	/* disable the perf event */
>
> Hi Thomas,
>
> Your patch also fixes a bug[0] reported against Ubuntu. I assume the window
> for v3.7 is closed. Will you be submitting this patch for inclusion in v3.8?

Sure, with a stable tag so it gets back into 3.7.

Thanks,

	tglx
Re: [RFC v2 8/8] drm: tegra: Add gr2d device
On 11/28/2012 02:33 PM, Lucas Stach wrote:
> On Wednesday, 28.11.2012 at 15:17 +0200, Terje Bergström wrote:
> > On 28.11.2012 01:00, Dave Airlie wrote:
> > > We generally aim for the first, to stop the gpu from reading/writing
> > > any memory it hasn't been granted access to; the second is nice to
> > > have though, but really requires a GPU with VM to implement properly.
> >
> > I wonder if we should aim at root-only access on Tegra20, and force
> > IOMMU on Tegra30 and fix the remaining issues we have with IOMMU.
> >
> > The firewall turns out to be more complicated than I wished. The biggest
> > problem is that we aim at zero-copy for everything possible, including
> > command streams. The kernel gets a handle to a command stream, but the
> > command stream is allocated by the user space process. So user space
> > can tamper with the stream once it's been written to the host1x 2D
> > channel.
>
> So this is obviously wrong. Userspace has to allocate a pushbuffer from
> the kernel just as every other buffer, then map it into its own address
> space to push in commands. At submit time of the pushbuf the kernel has
> to make sure that userspace is not able to access the memory any more,
> i.e. the kernel shoots down the vma or pagetable of the vma.

To me this sounds very expensive. Zapping the page table requires a CPU TLB flush on all cores that have touched the buffer, not to mention the kernel calls required to set up the page table once the buffer is reused. If this usage scheme is then combined with a command verifier or "firewall" that reads from a *write-combined* pushbuffer, performance will be bad. Really bad.

In such situations I think one should consider copy-from-user while validating, and let user space set up the command buffer in malloced memory.

/Thomas
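A minimal sketch of the copy-from-user-while-validating idea suggested above, assuming the command stream is a flat array of 32-bit words. validate_cmd() is a hypothetical placeholder for the real firewall checks, and the actual host1x submission path is not shown:

#include <linux/err.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/uaccess.h>

/* Placeholder for the real checks (opcodes, relocations, address ranges). */
static bool validate_cmd(u32 cmd)
{
	return true;
}

/*
 * Copy the user-supplied command words into a kernel-owned buffer,
 * validate the kernel copy, and return it; user space can no longer
 * tamper with the stream the hardware will actually see.
 */
static u32 *copy_and_validate_stream(const u32 __user *ucmds, size_t words)
{
	u32 *cmds;
	size_t i;

	cmds = kcalloc(words, sizeof(*cmds), GFP_KERNEL);
	if (!cmds)
		return ERR_PTR(-ENOMEM);

	if (copy_from_user(cmds, ucmds, words * sizeof(*cmds))) {
		kfree(cmds);
		return ERR_PTR(-EFAULT);
	}

	for (i = 0; i < words; i++) {
		if (!validate_cmd(cmds[i])) {
			kfree(cmds);
			return ERR_PTR(-EINVAL);
		}
	}

	return cmds;	/* the caller submits this kernel copy */
}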
Re: 3.6.7-rt18 ARM BUG_ON() at kernel/sched/core.c:3817
On Wed, 28 Nov 2012, Frank Rowand wrote:

> 3.6.7-rt18: kernel BUG at .../kernel/sched/core.c:3817!
>
> Grant reported this same problem for 3.6.5-rt15.
>
> I am seeing it on a different arm board.
>
> Here is the BUG_ON():
>
>    asmlinkage void __sched preempt_schedule_irq(void)
>    {
>        struct thread_info *ti = current_thread_info();
>
>        /* Catch callers which need to be fixed */
>        BUG_ON(ti->preempt_count || !irqs_disabled());
>
> Putting in some extra printk(), the BUG_ON() is triggering because
> ti->preempt_count is non-zero.
>
> It appears that the cause is in arch/arm/kernel/entry-armv.S.
>
> The call to preempt_schedule_irq() is from svc_preempt:
>
>    #ifdef CONFIG_PREEMPT
>    svc_preempt:
>        mov r8, lr
>    1:  bl preempt_schedule_irq         @ irq en/disable is done inside
>
> svc_preempt is branched to from one of two possible places. The first was
> present before the lazy preempt code was added. The first appears ok to me.
> (Note that the first branch does not occur if preempt count is non-zero.)
>
> The second branch can occur even if the preempt count is non-zero (which is
> what the BUG_ON() is finding):
>
>    __irq_svc:
>        svc_entry
>        irq_handler
>
>    #ifdef CONFIG_PREEMPT
>        get_thread_info tsk
>        ldr r8, [tsk, #TI_PREEMPT]        @ get preempt count
>        ldr r0, [tsk, #TI_FLAGS]          @ get flags
>        teq r8, #0                        @ if preempt count != 0
>        movne r0, #0                      @ force flags to 0
>        tst r0, #_TIF_NEED_RESCHED
>        blne svc_preempt
>        ldr r8, [tsk, #TI_PREEMPT_LAZY]   @ get preempt lazy count
>        ldr r0, [tsk, #TI_FLAGS]          @ get flags
>        teq r8, #0                        @ if preempt lazy count != 0
>        movne r0, #0                      @ force flags to 0
>        tst r0, #_TIF_NEED_RESCHED_LAZY
>        blne svc_preempt
>    #endif

Bah. I knew that I had messed up the ASM magic.
Re: [3.9-rc1] irq 16: nobody cared (was [3.9-rc1] very poor interrupt responses)
My laptop is an Acer 1810T. I see this error message on each boot.

Kind regards,
Thomas

Jiri Kosina wrote:

> On Fri, 15 Mar 2013, Jiri Kosina wrote:
>
>> > I have the same problem on my Lenovo T500. I think the graphics card is
>> > involved.
>> >
>> > This laptop has "hybrid graphics" - one Intel GMA 4500MHD and one ATI
>> > Mobility Radeon HD 3650. When I boot with the Intel card, I get "irq 16:
>> > nobody cared" during boot, not when I boot with the ATI card.
>>
>> Confirming this. After a lot of hassle, I have bisected this reliably to
>>
>> commit 28c70f162a315bdcfbe0bf940a740ef8bfb918d6
>> Author: Daniel Vetter
>> Date:   Sat Dec 1 13:53:45 2012 +0100
>>
>>     drm/i915: use the gmbus irq for waits
>>
>> Adding Daniel, Imre and Daniel to CC while I will try to figure out what's
>> happening in parallel.
>>
>> Attaching dmesg.txt from the machine with 28c70f162a as head, with
>> drm.debug=0xe.
>
> Just a datapoint -- I have put a trivial debugging patch in place, and it
> reveals that "nobody cared" for irq 16 happens long after the last
>
> 	I915_WRITE(GMBUS4 + reg_offset, 0);
>
> has been performed in gmbus_wait_hw_status(). On the other hand, if I
> comment out both GMBUS4 register offset writes in gmbus_wait_hw_status(),
> then it of course falls back to GPIO bit-banging, but the "nobody cared"
> for irq 16 is gone.
>
> So it seems like something gets severely confused by the I915_WRITE to
> GMBUS4 + reg_offset. So far this seems to have been reported solely on
> Lenovos as far as I can see (although completely different types), so it
> might be some platform-specific quirk?
>
> Honestly, I still don't understand how all the GMBUS stuff relates to IRQ
> 16 at all.
>
> --
> Jiri Kosina
> SUSE Labs
Re: [RFC PATCH v2] of/pci: Provide support for parsing PCI DT ranges property
Hello, What is the status of the below patch? Both Marvell PCIe driver and Tegra PCIe driver need a way to parse the ranges = <...> property of the PCI DT node. Would it be possible to get this patch merged for 3.10, or get some review comments that would allow us to rework it in time for 3.10 ? Thanks, Thomas On Thu, 21 Feb 2013 15:47:09 +, Andrew Murray wrote: > DT bindings for PCI host bridges often use the ranges property to describe > memory and IO ranges - this binding tends to be the same across architectures > yet several parsing implementations exist, e.g. arch/mips/pci/pci.c, > arch/powerpc/kernel/pci-common.c, arch/sparc/kernel/pci.c and > arch/microblaze/pci/pci-common.c (clone of PPC). Some of these duplicate > functionality provided by drivers/of/address.c. > > This patch factors out common implementations patterns to reduce overall > kernel > code and provide a means for host bridge drivers to directly obtain struct > resources from the DT's ranges property without relying on architecture > specific > DT handling. This will make it easier to write archiecture independent host > bridge > drivers and mitigate against further duplication of DT parsing code. > > This patch can be used in the following way: > > struct of_pci_range_iter iter; > for_each_of_pci_range(&iter, np) { > > //directly access properties of the address range, e.g.: > //iter.pci_space, iter.pci_addr, iter.cpu_addr, iter.size or > //iter.flags > > //alternatively obtain a struct resource, e.g.: > //struct resource res; > //range_iter_fill_resource(iter, np, res); > } > > Additionally the implementation takes care of adjacent ranges and merges them > into a single range (as was the case with powerpc and microblaze). > > The modifications to microblaze, mips and powerpc have not been tested. > > v2: > - This follows on from suggestions made by Grant Likely > (marc.info/?l=linux-kernel&m=136079602806328) > > Signed-off-by: Andrew Murray > Signed-off-by: Liviu Dudau > --- > arch/microblaze/pci/pci-common.c | 100 +++-- > arch/mips/pci/pci.c | 44 - > arch/powerpc/kernel/pci-common.c | 93 ++- > drivers/of/address.c | 54 > include/linux/of_address.h | 30 +++ > 5 files changed, 151 insertions(+), 170 deletions(-) > > diff --git a/arch/microblaze/pci/pci-common.c > b/arch/microblaze/pci/pci-common.c > index 4dbb505..ccc0d63 100644 > --- a/arch/microblaze/pci/pci-common.c > +++ b/arch/microblaze/pci/pci-common.c > @@ -659,67 +659,37 @@ void __devinit pci_process_bridge_OF_ranges(struct > pci_controller *hose, > struct device_node *dev, > int primary) > { > - const u32 *ranges; > - int rlen; > - int pna = of_n_addr_cells(dev); > - int np = pna + 5; > int memno = 0, isa_hole = -1; > - u32 pci_space; > - unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size; > unsigned long long isa_mb = 0; > struct resource *res; > + struct of_pci_range_iter iter; > > printk(KERN_INFO "PCI host bridge %s %s ranges:\n", > dev->full_name, primary ? 
"(primary)" : ""); > > - /* Get ranges property */ > - ranges = of_get_property(dev, "ranges", &rlen); > - if (ranges == NULL) > - return; > - > - /* Parse it */ > pr_debug("Parsing ranges property...\n"); > - while ((rlen -= np * 4) >= 0) { > + for_each_of_pci_range(&iter, dev) { > /* Read next ranges element */ > - pci_space = ranges[0]; > - pci_addr = of_read_number(ranges + 1, 2); > - cpu_addr = of_translate_address(dev, ranges + 3); > - size = of_read_number(ranges + pna + 3, 2); > - > pr_debug("pci_space: 0x%08x pci_addr:0x%016llx " > "cpu_addr:0x%016llx size:0x%016llx\n", > - pci_space, pci_addr, cpu_addr, size); > - > - ranges += np; > + iter.pci_space, iter.pci_addr, iter.cpu_addr, > + iter.size); > > /* If we failed translation or got a zero-sized region >* (some FW try to feed us with non sensical zero sized regions >* such as power3 which look like some kind of attem
Re: [PATCH 2/2] netlink: Diag core and basic socket info dumping
On 03/21/13 at 01:21pm, Andrey Vagin wrote: > diff --git a/include/uapi/linux/netlink_diag.h > b/include/uapi/linux/netlink_diag.h > new file mode 100644 > index 000..9328866 > --- /dev/null > +++ b/include/uapi/linux/netlink_diag.h > +enum { > + NETLINK_DIAG_MEMINFO, > + NETLINK_DIAG_GROUPS, > + > + NETLINK_DIAG_MAX, > +}; Please follow the common pattern and define NETLINK_DIAG_MAX as NETLINK_DIAG_GROUPS, like others do: [...] __NETLINK_DIAG_MAX, }; #define NETLINK_DIAG_MAX (__NETLINK_DIAG_MAX - 1) Everyone is used to doing: struct nlattr *attrs[NETLINK_DIAG_MAX+1]; nla_parse([...], NETLINK_DIAG_MAX, [...] In fact, the follow-up patch to ss is buggy because of this. UNIX_DIAG_MAX suffers from the same problem, which is probably the cause for this. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
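For reference, the pattern being asked for looks like this once written out in full (a sketch that mirrors the eventual fix; the surrounding header contents are assumed):

/* Sketch of include/uapi/linux/netlink_diag.h with the requested pattern.
 * The __NETLINK_DIAG_MAX sentinel keeps NETLINK_DIAG_MAX equal to the last
 * real attribute, so "struct nlattr *attrs[NETLINK_DIAG_MAX + 1];" together
 * with nla_parse(..., NETLINK_DIAG_MAX, ...) covers every attribute. */
enum {
	NETLINK_DIAG_MEMINFO,
	NETLINK_DIAG_GROUPS,

	__NETLINK_DIAG_MAX,
};
#define NETLINK_DIAG_MAX	(__NETLINK_DIAG_MAX - 1)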
Re: [PATCH 2/2] netlink: Diag core and basic socket info dumping
On 03/21/13 at 06:31pm, Andrew Vagin wrote: > The code in ss looks like you described: > struct rtattr *tb[UNIX_DIAG_MAX+1]; > ... > parse_rtattr(tb, UNIX_DIAG_MAX, (struct rtattr*)(r+1), > nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r))); > > > struct rtattr *tb[NETLINK_DIAG_MAX+1]; > ... > parse_rtattr(tb, NETLINK_DIAG_MAX, (struct rtattr*)(r+1), > nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r))) > > I think I should only update headers... Or I don't understand something. Right, fixing the headers will resolve the issue. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] net: fix *_DIAG_MAX constants
On 03/21/13 at 06:18pm, Andrey Vagin wrote: > Follow the common pattern and define *_DIAG_MAX like: > > [...] > __XXX_DIAG_MAX, > }; > > Because everyone is used to do: > > struct nlattr *attrs[XXX_DIAG_MAX+1]; > > nla_parse([...], XXX_DIAG_MAX, [...] > > Reported-by: Thomas Graf > Cc: "David S. Miller" > Cc: Pavel Emelyanov > Cc: Eric Dumazet > Cc: "Paul E. McKenney" > Cc: David Howells > Signed-off-by: Andrey Vagin Acked-by: Thomas Graf -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] net: fix *_DIAG_MAX constants
On 03/21/13 at 11:14am, David Miller wrote: > So you're ACK'ing a patch that makes changes to files that don't even > exist in the repository? I have been ACK'ing the patch in the context of the previous patch that I reviewed in the first place which in summary is now OK. But you are obviously right that a fixed version of the initial patch should be submitted instead. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH RESEND v2] of/pci: Provide support for parsing PCI DT ranges property
Dear Andrew Murray, On Fri, 1 Mar 2013 12:23:36 +, Andrew Murray wrote: > This patch factors out common implementations patterns to reduce overall > kernel > code and provide a means for host bridge drivers to directly obtain struct > resources from the DT's ranges property without relying on architecture > specific > DT handling. This will make it easier to write archiecture independent host > bridge > drivers and mitigate against further duplication of DT parsing code. > > This patch can be used in the following way: > > struct of_pci_range_iter iter; > for_each_of_pci_range(&iter, np) { > > //directly access properties of the address range, e.g.: > //iter.pci_space, iter.pci_addr, iter.cpu_addr, iter.size or > //iter.flags > > //alternatively obtain a struct resource, e.g.: > //struct resource res; > //range_iter_fill_resource(iter, np, res); > } > > Additionally the implementation takes care of adjacent ranges and merges them > into a single range (as was the case with powerpc and microblaze). > > The modifications to microblaze, mips and powerpc have not been tested. > > v2: > This follows on from suggestions made by Grant Likely > (marc.info/?l=linux-kernel&m=136079602806328) > > Signed-off-by: Andrew Murray > Signed-off-by: Liviu Dudau Thanks, I've tested this successfully with the Marvell PCIe driver. I'm about to send a new version of the Marvell PCIe patch set that includes this RFC proposal. I only made two small changes compared to your version, detailed below. > +#define for_each_of_pci_range(iter, np) \ > + for (; of_pci_process_ranges(iter, np);) In the initial part of the loop, I added a memset() to initialize to zero the "iter" structure. Otherwise, if you forget to do it before calling of_pci_process_ranges(), it may crash (depending on the random values present in the uninitialized structure). > +#define range_iter_fill_resource(iter, np, res) \ > + do { \ > + res->flags = iter.flags; \ > + res->start = iter.cpu_addr; \ > + res->end = iter.cpu_addr + iter.size - 1; \ > + res->parent = res->child = res->sibling = NULL; \ > + res->name = np->full_name; \ > + } while (0) And here, I enclosed all the usage of the macro parameters in parenthesis. Like (res)->flags instead of res->flags. If you don't do that, then passing &foobar as the 'res' parameter causes some compilation failure because &foobar->res is not valid, while (&foobar)->res is. Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
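A sketch of the two adjustments described above, applied to the macros quoted from the RFC (the exact final form is assumed, not taken from a posted patch):

/* Sketch only: the iterator is zeroed at loop entry, and every macro
 * argument is parenthesized so callers may pass expressions like &port->res. */
#define for_each_of_pci_range(iter, np)					\
	for (memset((iter), 0, sizeof(*(iter)));			\
	     of_pci_process_ranges((iter), (np));)

#define range_iter_fill_resource(iter, np, res)				\
	do {								\
		(res)->flags = (iter).flags;				\
		(res)->start = (iter).cpu_addr;				\
		(res)->end = (iter).cpu_addr + (iter).size - 1;		\
		(res)->parent = (res)->child = (res)->sibling = NULL;	\
		(res)->name = (np)->full_name;				\
	} while (0)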
Re: hrtimer possible issue
On Sun, 3 Feb 2013, Izik Eidus wrote: > Hi, > > it seems like hrtimer_enqueue_reprogram contain a race which could result in > timer.base switch during unlock/lock sequence. > > See the code at __hrtimer_start_range_ns where it calls > hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer > base for a short time and timer base switch can occur from a different CPU > thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base > switch could have happened and this causes the bug > > Try to start the same hrtimer from two different threads in kernel running > each one on a different CPU. Eventually one of the calls will cause timer base > switch while another thread is not expecting it. > > This can happen in virtualized environment where one thread can be delayed by > lower hypervisor, and due to time delay a different CPU is taking care of > missed timer start and runs the timer start logic on its own. Nice analysis. > This simple patch (just to give example of a fix) refactor this function to > get rid of unneeded lock which immediately was followed by the unlock (with > possible undesired base switch). > > (Both the bug and the fixed were found/patched by Leonid Shatz) The patch got mangled by your mail client and it is missing the proper Signed-off-by annotation in the patch description. See Documentation/SubmittingPatches. Can you please resend ? Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] OF: Fixup recursive locking code paths
On Fri, 25 Jan 2013, Paul Gortmaker wrote: > From: Thomas Gleixner > > There is no real reason to use a rwlock for devtree_lock. It even > could be a mutex, but unfortunately it's locked from cpu hotplug > paths which can't schedule :( > > So it needs to become a raw lock on rt as well. The devtree_lock would > be the only user of a raw_rw_lock, so we are better off cleaning up the > recursive locking paths which allows us to convert devtree_lock to a > read_lock. Hmm. It's already a rw_lock. For RT we want to change that thing to a raw_spinlock. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
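For reference, roughly what such a conversion looks like at one call site (a sketch; the lockless __of_find_property() helper and the exact locking shape are assumptions, not the quoted patch):

/* Sketch: devtree_lock as a raw spinlock, with the list walk done in a
 * lockless helper so recursive callers can take the lock once. */
static raw_spinlock_t devtree_lock = __RAW_SPIN_LOCK_UNLOCKED(devtree_lock);

struct property *of_find_property(const struct device_node *np,
				  const char *name, int *lenp)
{
	struct property *pp;
	unsigned long flags;

	raw_spin_lock_irqsave(&devtree_lock, flags);
	pp = __of_find_property(np, name, lenp);	/* assumed lockless helper */
	raw_spin_unlock_irqrestore(&devtree_lock, flags);

	return pp;
}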
[ANNOUNCE] 3.6.11-rt26
Dear RT Folks, I'm pleased to announce the 3.6.11-rt26 release. Changes since 3.6.11-rt25: 1) Fix the RT highmem implementation on x86 2) Support highmem + RT on ARM 3) Fix an one off error in the generic highmem code (upstream fix did not make it into 3.6.stable) 4) Upstream SLUB fixes (Christoph Lameter) 5) Fix a few RT issues in mmc and amba drivers 6) Initialize local locks in mm/swap.c early 7) Use simple wait queues for completions. This is a performance improvement. Completions do not have complex callbacks and the wakeup path is disabling interrupts anyway. So using simple wait locks with the raw spinlock is not a latency problem, but the "sleeping lock" in the normal waitqueue is a source for lock bouncing: T1 T2 lock(WQ) wakeup(T2) ---> preemption lock(WQ) pi_boost(T1) wait_for_lock(WQ) unlock(WQ) deboost(T1) ---> preemption The simple waitqueue reduces this to: T1 T2 raw_lock(WQ) wakeup(T2) raw_unlock(WQ) ---> preemption raw_lock(WQ) @Steven: Sorry, I forgot the stable tags on: drivers-tty-pl011-irq-disable-madness.patch mmci-remove-bogus-irq-save.patch idle-state.patch might-sleep-check-for-idle.patch mm-swap-fix-initialization.patch I'm still digging through my mail backlog, so I have not yet decided whether this is the last RT release for 3.6. The delta patch against 3.6.11-rt25 is appended below and can be found here: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/incr/patch-3.6.11-rt25-rt26.patch.xz The RT patch against 3.6.11 can be found here: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patch-3.6.11-rt26.patch.xz The split quilt queue is available at: http://www.kernel.org/pub/linux/kernel/projects/rt/3.6/patches-3.6.11-rt26.tar.xz Enjoy, tglx -> Index: linux-stable/arch/arm/Kconfig === --- linux-stable.orig/arch/arm/Kconfig +++ linux-stable/arch/arm/Kconfig @@ -1749,7 +1749,7 @@ config HAVE_ARCH_PFN_VALID config HIGHMEM bool "High Memory Support" - depends on MMU && !PREEMPT_RT_FULL + depends on MMU help The address space of ARM processors is only 4 Gigabytes large and it has to accommodate user address space, kernel address Index: linux-stable/arch/x86/mm/highmem_32.c === --- linux-stable.orig/arch/x86/mm/highmem_32.c +++ linux-stable/arch/x86/mm/highmem_32.c @@ -21,6 +21,7 @@ void kunmap(struct page *page) } EXPORT_SYMBOL(kunmap); +#ifndef CONFIG_PREEMPT_RT_FULL /* * kmap_atomic/kunmap_atomic is significantly faster than kmap/kunmap because * no global lock is needed and because the kmap code must perform a global TLB @@ -115,6 +116,7 @@ struct page *kmap_atomic_to_page(void *p return pte_page(*pte); } EXPORT_SYMBOL(kmap_atomic_to_page); +#endif void __init set_highmem_pages_init(void) { Index: linux-stable/include/linux/wait-simple.h === --- linux-stable.orig/include/linux/wait-simple.h +++ linux-stable/include/linux/wait-simple.h @@ -22,12 +22,14 @@ struct swait_head { struct list_headlist; }; -#define DEFINE_SWAIT_HEAD(name)\ - struct swait_head name = { \ +#define SWAIT_HEAD_INITIALIZER(name) { \ .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ .list = LIST_HEAD_INIT((name).list), \ } +#define DEFINE_SWAIT_HEAD(name)\ + struct swait_head name = SWAIT_HEAD_INITIALIZER(name) + extern void __init_swait_head(struct swait_head *h, struct lock_class_key *key); #define init_swait_head(swh) \ @@ -40,59 +42,25 @@ extern void __init_swait_head(struct swa /* * Waiter functions */ -static inline bool swaiter_enqueued(struct swaiter *w) -{ - return w->task != NULL; -} - +extern void swait_prepare_locked(struct swait_head *head, struct swaiter *w); extern void 
swait_prepare(struct swait_head *head, struct swaiter *w, int state); +extern void swait_finish_locked(struct swait_head *head, struct swaiter *w); extern void swait_finish(struct swait_head *head, struct swaiter *w); /* - * Adds w to head->list. Must be called with head->lock locked. - */ -static inline void __swait_enqueue(struct swait_head *head, struct swaiter *w) -{ - list_add(&w->node, &head->list); -} - -/* - * Removes w from
Re: [ANNOUNCE] 3.6.11-rt26
On Mon, 4 Feb 2013, Thomas Gleixner wrote: > Dear RT Folks, > > I'm pleased to announce the 3.6.11-rt26 release. > > Changes since 3.6.11-rt25: Forgot to mention the change from EXPORT_SYMBOL_GPL to EXPORT_SYMBOL for pagefault_dis/enable. I really hate it, but it breaks the compilation of ^!@%^$@ drivers which work fine against mainline. Sigh! @Steven: Wants to go into stable-rt as well Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] 3.6.11-rt26
On Mon, 4 Feb 2013, Clark Williams wrote: > More changes; I was running into a collision with the name kmap_prot. Bah. I knew that I should have decided that today is still part of the weekend. Pushed out rt27 with the fix merged back. Sorry for the noise. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANNOUNCE] 3.6.11-rt26
On Tue, 5 Feb 2013, Qiang Huang wrote: > On 2013/2/4 22:58, Thomas Gleixner wrote: > >From patches-3.6.11-rt28.patch.gz, your patch x86-highmem-make-it-work.patch > did this work. And you said > "It had been enabled quite some time, but never really worked." > > But I think there is a previous patch mm-rt-kmap-atomic-scheduling.patch did > the job, so I think RT highmem on x86 should have worked. > > Now with your patch, if we use kmap instead of kmap_atomic on RT, do we need > to revert Peter's patch as well? I should have done that, yes. > I haven't tested it, but if Peter's patch did solved the problem, is his way > better than use kmap? Because we can use more highmem virtual address, > although with some switch latency in some small probability scenarios. In theory it's better. Though I ran into some issues with that approach. It's on my todo list to revisit that problem, but for now the kmap way is at least safer. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: hrtimer possible issue
On Mon, 4 Feb 2013, Leonid Shatz wrote: > I assume the race can also happen between hrtimer cancel and start. In both > cases timer base switch can happen. > > Izik, please check if you can arrange the patch in the standard format (do > we need to do it against latest kernel version?) Yes please. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] fix hrtimer_enqueue_reprogram race
On Mon, 4 Feb 2013, Izik Eidus wrote: > From: leonid Shatz > > it seems like hrtimer_enqueue_reprogram contain a race which could result in > timer.base switch during unlock/lock sequence. > > See the code at __hrtimer_start_range_ns where it calls > hrtimer_enqueue_reprogram. The later is releasing lock protecting the timer > base for a short time and timer base switch can occur from a different CPU > thread. Later when __hrtimer_start_range_ns calls unlock_hrtimer_base, a base > switch could have happened and this causes the bug > > Try to start the same hrtimer from two different threads in kernel running > each one on a different CPU. Eventually one of the calls will cause timer base > switch while another thread is not expecting it. Aside of the bug in the hrtimer code being a real one, writing code which fiddles with the same resource (hrtimer) unserialized is broken on its own. > This can happen in virtualized environment where one thread can be delayed by > lower hypervisor, and due to time delay a different CPU is taking care of > missed timer start and runs the timer start logic on its own. Without noticing that something else already takes care of it? So you're saying that the code in question relies on magic serialization in the hrtimer code. Doesn't look like a brilliant design. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: clock_nanosleep() task_struct leak
On Tue, 5 Feb 2013, Stanislaw Gruszka wrote: > On Mon, Feb 04, 2013 at 08:32:23PM +0100, Oleg Nesterov wrote: > > On 02/01, Thomas Gleixner wrote: > > > > > > On Fri, 1 Feb 2013, Tommi Rantala wrote: > > > > > > > Hello, > > > > > > > > Trinity discovered a task_struct leak with clock_nanosleep(), > > > > reproducible with: > > > > > > > > -8<-8<-8<- > > > > #include <time.h> > > > > > > > > static const struct timespec req; > > > > > > > > int main(void) { > > > > return clock_nanosleep(CLOCK_PROCESS_CPUTIME_ID, > > > > TIMER_ABSTIME, &req, NULL); > > > > } > > > > -8<-8<-8<- > > > > posix_cpu_timer_create()->get_task_struct() I guess... > > > > Cough. I am not sure I ever understood this code, but now it certainly > > looks as if I never saw it before. > > Looks on do_cpu_nanosleep() we call posix_cpu_timer_create(), but we do > not call posix_cpu_timer_del() at the end. Fix will not be super simple, > since we need to care about error cases. I can cook a patch if nobody > else want to do this. Would be much appreciated! Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH] fix hrtimer_enqueue_reprogram race
Leonid, On Tue, 5 Feb 2013, Leonid Shatz wrote: Please stop top posting! > The explanation were submitted as possible scenario which could explain how > the bug in kernel could happen and it does not mean that serious designer > could do exactly that. As I said before, it's also possible that a race > between hrtimer_cancel and hrtimer_start can trigger the bug. The idea is to > have kernel more robust. I'm not against making the kernel more robust and I already applied the patch. > There are already locks used inside hrtimer code, so why should > users of the hrtimer add another layer of locks and get involved in > the intricacy of which cases are protected by internal hrtimer lock > and which are not? Groan. The hrtimer locks are there to protect the internal data structures of the hrtimer code and to ensure that hrtimer functions are proper protected against concurrent running callbacks. But that does not give you any kind of protection versus multiple users of your hrtimer resource. Look at the following scenario: CPU0CPU1 hrtimer_cancel() hrtimer_start() teardown_crap() hrtimer_callback() runs That's probably not what you want and magic serialization in the hrtimer code does not help at all. There is also no protection against: CPU0CPU1 hrtimer_cancel() hrtimer_start() hrtimer_forward() Which leaves the hrtimer enqueued on CPU1 with a wrong expiry value. So while concurrent hrtimer_start() is protected, other things are not. So do we need to create a list of functions which can be abused by a programmer without proper protection of the resource and which not? If you want to use any kind of resource (including hrtimers) concurrently you better have proper serialization in that code. Everything else is voodoo programming of the worst kind. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
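A minimal sketch of the caller-side serialization being asked for; all names here are made up for illustration and this is not code from the thread:

/* The hrtimer is treated as a shared resource: every start goes through a
 * lock owned by the user, so two CPUs cannot race hrtimer_start() against
 * hrtimer_cancel()/hrtimer_forward() on the same timer. */
struct my_timer {
	struct hrtimer	timer;
	spinlock_t	lock;		/* serializes all users of .timer */
	bool		shutdown;	/* set once the timer is being torn down */
};

static void my_timer_start(struct my_timer *mt, ktime_t delta)
{
	unsigned long flags;

	spin_lock_irqsave(&mt->lock, flags);
	if (!mt->shutdown)
		hrtimer_start(&mt->timer, delta, HRTIMER_MODE_REL);
	spin_unlock_irqrestore(&mt->lock, flags);
}

static void my_timer_teardown(struct my_timer *mt)
{
	unsigned long flags;

	spin_lock_irqsave(&mt->lock, flags);
	mt->shutdown = true;		/* no new starts after this point */
	spin_unlock_irqrestore(&mt->lock, flags);

	/* Called without mt->lock held, so a running callback that takes
	 * mt->lock cannot deadlock while hrtimer_cancel() waits for it. */
	hrtimer_cancel(&mt->timer);
}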
Re: [PATCH 06/14] ARM: pci: Keep pci_common_init() around after init
Dear Thierry Reding, On Wed, 9 Jan 2013 21:43:06 +0100, Thierry Reding wrote: > When using deferred driver probing, PCI host controller drivers may > actually require this function after the init stage. > > Signed-off-by: Thierry Reding Tested-by: Thomas Petazzoni -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/32] lib: devres: don't enclose pcim_*() functions in CONFIG_HAS_IOPORT
The pcim_*() functions are used by the libata-sff subsystem, and this subsystem is used for many SATA drivers on ARM platforms that do not necessarily have I/O ports. Signed-off-by: Thomas Petazzoni Cc: Paul Gortmaker Cc: Jesse Barnes Cc: Yinghai Lu Cc: linux-kernel@vger.kernel.org --- lib/devres.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/devres.c b/lib/devres.c index 80b9c76..5639c3e 100644 --- a/lib/devres.c +++ b/lib/devres.c @@ -195,6 +195,7 @@ void devm_ioport_unmap(struct device *dev, void __iomem *addr) devm_ioport_map_match, (void *)addr)); } EXPORT_SYMBOL(devm_ioport_unmap); +#endif /* CONFIG_HAS_IOPORT */ #ifdef CONFIG_PCI /* @@ -400,4 +401,3 @@ void pcim_iounmap_regions(struct pci_dev *pdev, int mask) } EXPORT_SYMBOL(pcim_iounmap_regions); #endif /* CONFIG_PCI */ -#endif /* CONFIG_HAS_IOPORT */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 05/32] lib: devres: don't enclose pcim_*() functions in CONFIG_HAS_IOPORT
Dear Arnd Bergmann, On Tue, 12 Feb 2013 18:00:48 +, Arnd Bergmann wrote: > On Tuesday 12 February 2013, Thomas Petazzoni wrote: > > The pcim_*() functions are used by the libata-sff subsystem, and > > this subsystem is used for many SATA drivers on ARM platforms that > > do not necessarily have I/O ports. > > > > Signed-off-by: Thomas Petazzoni > > Cc: Paul Gortmaker > > Cc: Jesse Barnes > > Cc: Yinghai Lu > > Cc: linux-kernel@vger.kernel.org > > Sorry, but this patch is still incorrect. I know, but the discussion was so huge on the first posting that it was basically impossible to draw a conclusion out of it. > Any driver that requires a > linear mapping of I/O ports to __iomem pointers must depend > CONFIG_HAS_IOPORT with the current definition of that symbol (as > mentioned before, we should really rename that to > CONFIG_HAS_IOPORT_MAP). Having these functions not defined is a > compile time check that is necessary to ensure that all drivers have > the correct annotation. I have the feeling that the problem is more complex than that. My understanding is that the pcim_iomap_regions() function used by drivers/ata/libata-sff.c can perfectly be used to map memory BARs, and not necessarily I/O BARs. Therefore, this driver can perfectly be used in an architecture where CONFIG_NO_IOPORT is selected. The thing is that pcim_iomap_regions() transparently allows to remap an I/O BAR is such a BAR is passed as argument, or a memory BAR if such a BAR is passed as argument. Therefore, I continue to believe that the pcim_*() functions are useful even if the platform doesn't have CONFIG_HAS_IOPORT. Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
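For what it is worth, a sketch of the libata-style usage being argued for, where pcim_iomap_regions() only ever touches a memory BAR (the driver name and BAR number are made up):

#include <linux/pci.h>

#define MYDEV_BAR	0	/* assumed MMIO BAR, not an I/O BAR */

/* Hypothetical probe: the pcim_*() helpers manage the mapping, so they are
 * useful on platforms without I/O ports as long as only memory BARs are
 * passed in the mask. */
static int mydev_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	void __iomem *mmio;
	int rc;

	rc = pcim_enable_device(pdev);
	if (rc)
		return rc;

	rc = pcim_iomap_regions(pdev, 1 << MYDEV_BAR, "mydev");
	if (rc)
		return rc;

	mmio = pcim_iomap_table(pdev)[MYDEV_BAR];
	/* ... program the device through mmio ... */

	return 0;
}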
Re: [PATCH V2 2/2] cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
On Thursday, March 28, 2013 01:24:17 PM Jacob Shin wrote: > Future AMD processors, starting with Family 16h, can provide software > with feedback on how the workload may respond to frequency change -- > memory-bound workloads will not benefit from higher frequency, where > as compute-bound workloads will. This patch enables this "frequency > sensitivity feedback" to aid the ondemand governor to make better > frequency change decisions by hooking into the powersave bias. If I read this correctly, nothing changes even if the driver is loaded, unless user modifies: /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias is this correct? I wonder who should modify: /sys/devices/system/cpu/cpufreq/ondemand/powersave_bias Even cpupower is not aware of this very specific tunable. Also, are you sure cpufreq subsystem will be the only user of this one? Or could cpuidle or others also make use of this somewhen in the future? Then this could more be done like: drivers/cpufreq/mperf.c And scheduler, cpuidle, cpufreq or whatever could use this as well. Just some thinking: I wonder how one could check/verify that the right thing is done (by CPU and kernel). Ideally it would be nice to have the CPU register appended to a cpufreq or cpuidle event trace. But this very (AMD or X86 only?) specific data would not look nice there. An arch placeholder value would be needed or similar? ... > +} > + > +static int __init amd_freq_sensitivity_init(void) > +{ > + int i; > + u32 eax, edx, dummy; > + > + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) > + return -ENODEV; > + > + cpuid(0x8007, &eax, &dummy, &dummy, &edx); If this really should be a separate module: Does/will Intel have the same (feature/cpuid bit)? Anyway, this should get a general AMD or X86 CPU capability flag. Then you can also autoload this driver similar to how it's done in acpi- cpufreq: static const struct x86_cpu_id acpi_cpufreq_ids[] = { X86_FEATURE_MATCH(X86_FEATURE_ACPI), X86_FEATURE_MATCH(X86_FEATURE_HW_PSTATE), {} }; MODULE_DEVICE_TABLE(x86cpu, acpi_cpufreq_ids); Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH V3 2/2] cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
On Tuesday, April 02, 2013 03:03:37 PM Jacob Shin wrote: > On Tue, Apr 02, 2013 at 09:23:52PM +0200, Borislav Petkov wrote: > > On Tue, Apr 02, 2013 at 01:11:44PM -0500, Jacob Shin wrote: > > > Future AMD processors, starting with Family 16h, can provide software > > > with feedback on how the workload may respond to frequency change -- > > > memory-bound workloads will not benefit from higher frequency, where > > > as compute-bound workloads will. This patch enables this "frequency > > > sensitivity feedback" to aid the ondemand governor to make better > > > frequency change decisions by hooking into the powersave bias. I had a quick look at the specification of these registers. So this seem to be designed and stay very cpufreq specific and other kernel parts probably won't make use of it. ... > > > + > > > + /* this workload is not CPU bound, so choose a lower freq */ > > > + if (sensitivity < od_tuners->powersave_bias) { > > > > Ok, I still didn't get an answer to that: don't we want to use this > > feature by default, even without looking at ->powersave_bias? I mean, > > with feedback from the hardware, we kinda know better than the user, no? > > Well, so this powersave_bias also works as a tunable knob. > > From ondemand side, if /sys/../ondemand/powersave_bias is 0, then we > (AMD sensitivity) don't get called and you get the default ondemand > behavior. > > Like existing powersave_bias, users can tune the value to whatever > they want, to get a specturum of less to more aggressive power savings > vs performance. I understand powersave_bias code to only be able to do a more aggressive power saving way: If you pass 900, a frequency of 90% (for example 900MHz instead of 1000MHz) of the one ondemand typically would choose is taken. powersave_bias values above 1000 (take higher frequencies than the ondemand would take) are not allowed. powersave_bias is undocumented in Documentation/cpu-freq/... I guess its use-case is for people who want to get some percent more power savings out of their laptop and do not care of the one or other percent performance. In fact I would like to get rid of this extra code and I expect nobody would miss it. I might miss a configuration tool where someone went through the code, documented things and allows users to set powersave_bias values through some /etc/* config files. If so, please point me to it. What your patch misses are some hints how and when to use this at all. What value should a user write to powersave_bias tunable to activate your stuff? I guess it's also for laptop users to get some percent more battery out of their platform and this with an even higher performance rate? Server guys do not care for some percent of power, but they do care for some percent of performance. > I thought tunable would be more flexible .. out in the field or what > not .. no? Yep, if you want anyone to make use of this, it should better get embedded in more general, at least general ondemand code. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] timer: Fix possible issues with non serialized timer_pending( )
Vineet, On Fri, 29 Mar 2013, Vineet Gupta wrote: > When stress testing ARC Linux from 3.9-rc3, we've hit a serialization > issue when mod_timer() races with itself. This is on a FPGA board and > kernel .config among others has !SMP and !PREEMPT_COUNT. > > The issue happens in mod_timer( ) because timer_pending( ) based early > exit check is NOT done inside the timer base spinlock - as a networking > optimization. > > The value used in there, timer->entry.next is also used further in call > chain (all inlines though) for actual list manipulation. However if the > register containing this pointer remains live across the spinlock (in a > UP setup with !PREEMPT_COUNT there's nothing forcing gcc to reload) then > a stale value of next pointer causes incorrect list manipulation, > observed with following sequence in our tests. > > (0). tv1[x] <> t1 <---> t2 > (1). mod_timer(t1) interrupted after it calls timer_pending() > (2). mod_timer(t2) completes > (3). mod_timer(t1) resumes but messes up the list. > (4). __runt_timers( ) uses bogus timer_list entry / crashes in > timer->function > > The simplest fix is to NOT rely on spinlock based compiler barrier but > add an explicit one in timer_pending() That's simple, but dangerous. There is other code which relies on the implicit barriers of spinlocks, so I think we need to add the barrier to the !PREEMPT_COUNT implementation of preempt_*() macros. Thanks, tglx > FWIW, the relevant ARCompact disassembly of mod_timer which clearly > shows the issue due to register reuse is: > > mod_timer: > push_s blink > mov_s r13,r0 # timer, timer > > ... > ## timer_pending( ) > ld_s r3,[r13] # <-- .entry.next LOADED > brne r3, 0, @.L163 > > .L163: > > ## spin_lock_irq( ) > lr r5, [status32] # flags > bic r4, r5, 6 # temp, flags, > and.f 0, r5, 6 # flags, > flag.nz r4 > > ## detach_if_pending( ) begins > > tst_s r3,r3 <-- > # timer_pending( ) checks timer->entry.next > # r3 is NOT reloaded by gcc, using stale value > beq.d @.L169 > mov.eq r0,0 > > # detach_timer( ): __list_del( ) > > ld r4,[r13,4] # .entry.prev, D.31439 > st r4,[r3,4] # .prev, D.31439 > st r3,[r4]# .next, D.30246 > > Signed-off-by: Vineet Gupta > Reported-by: Christian Ruppert > Cc: Thomas Gleixner > Cc: Christian Ruppert > Cc: Pierrick Hascoet > Cc: linux-kernel@vger.kernel.org > --- > include/linux/timer.h | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/include/linux/timer.h b/include/linux/timer.h > index 8c5a197..1537104 100644 > --- a/include/linux/timer.h > +++ b/include/linux/timer.h > @@ -168,7 +168,16 @@ static inline void init_timer_on_stack_key(struct > timer_list *timer, > */ > static inline int timer_pending(const struct timer_list * timer) > { > - return timer->entry.next != NULL; > + int pending = timer->entry.next != NULL; > + > + /* > + * The check above enables timer fast path - early exit. > + * However most of the call sites are not protected by timer->base > + * spinlock. If the caller (say mod_timer) races with itself, it > + * can use the stale "next" pointer. See commit log for details. > + */ > + barrier(); > + return pending; > } > > extern void add_timer_on(struct timer_list *timer, int cpu); > -- > 1.7.10.4 > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
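A sketch of the alternative Thomas suggests, i.e. putting the compiler barrier into the !PREEMPT_COUNT variants of the preempt macros instead of into timer_pending() (the exact shape is assumed; this is not the posted fix):

/* include/linux/preempt.h, !CONFIG_PREEMPT_COUNT case -- sketch only.
 * The barrier() keeps the compiler from caching values such as
 * timer->entry.next across what is a real critical section on
 * SMP/PREEMPT_COUNT builds. */
#define preempt_disable()		barrier()
#define preempt_enable_no_resched()	barrier()
#define preempt_enable()		barrier()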
Re: [PATCH V4 0/2] cpufreq: ondemand: add AMD specific powersave bias
On Thursday, April 04, 2013 11:19:02 AM Jacob Shin wrote: > This patchset adds AMD specific powersave bias function to the ondemand > governor; which can be used to help ondemand governor make more power > conscious frequency change decisions based on feedback from hardware > (availble on AMD Family 16h and above). Either the one way: 1) Documenting powersave_bias and add the stuff there, best with a default set so that the stuff gets used or 2) Marking powersave_bias deprecated and embed things into ondemand directly should be fine. As you give this some usefulness now and it's going to get used (automatically) and the stuff is even documented, I cannot suggest anything anymore how to integrate that better. Acked-by: Thomas Renninger -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 04:46:04 PM Yinghai Lu wrote: > One commit that tried to parse SRAT early get reverted before v3.9-rc1. > > | commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f > | Author: Tang Chen > | Date: Fri Feb 22 16:33:44 2013 -0800 > | > |acpi, memory-hotplug: parse SRAT before memblock is ready > > It broke several things, like acpi override and fall back path etc. > > This patchset is clean implementation that will parse numa info early. I tried acpi table overriding, but it did not work for me. In your tree there seem to miss acpi initrd overriding doku: Documentation/acpi/initrd_table_override.txt ? And your tree is 3.6.0-rc6-default+ based, right? I tried it like this: mkdir -p kernel/firmware/acpi cp dsdt.aml kernel/firmware/acpi find kernel | cpio -H newc --create > /boot/instrumented_initrd cat /boot/initrd >>/boot/instrumented_initrd modified /boot/grub/menu.lst and pointed to /boot/instrumented_initrd -> no override messages in dmesg, no overriding happened at all. Did I overlook something? Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 08:09:46 PM Yinghai Lu wrote: > On Thu, Apr 4, 2013 at 7:28 PM, Thomas Renninger wrote: > > On Thursday, April 04, 2013 04:46:04 PM Yinghai Lu wrote: > >> One commit that tried to parse SRAT early get reverted before v3.9-rc1. > >> > >> | commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f > >> | Author: Tang Chen > >> | Date: Fri Feb 22 16:33:44 2013 -0800 > >> | > >> |acpi, memory-hotplug: parse SRAT before memblock is ready > >> > >> It broke several things, like acpi override and fall back path etc. > >> > >> This patchset is clean implementation that will parse numa info early. > > > > I tried acpi table overriding, but it did not work for me. > > In your tree there seem to miss acpi initrd overriding doku: > > Documentation/acpi/initrd_table_override.txt > > http://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/tree/D > ocumentation/acpi/initrd_table_override.txt?h=for-x86-mm > > And your tree is 3.6.0-rc6-default+ based, right? > > It is in for-x86-mm branch, should be 3.9-rc5 based. > > http://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/tree/D > ocumentation/acpi/initrd_table_override.txt?h=for-x86-mm > > can you try > > git checkout -b for-x86-mm origin/for-x86-mm Argh stupid, I simply put a git clone before: could be found at: git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm I doubt I will make it today, so I'll try to give it a test on Mo. Thanks, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/22] x86, ACPI, numa: Parse numa info early
On Thursday, April 04, 2013 08:09:46 PM Yinghai Lu wrote: ... > can you try > > git checkout -b for-x86-mm origin/for-x86-mm That worked out much better :) I see these changes in e820 table, the first part is probably unrelated: BIOS-e820: [mem 0x-0x0009bbff] usable ... BIOS-e820: [mem 0x0010-0xba294fff] usable modified: [mem 0x-0x0fff] reserved modified: [mem 0x1000-0x0009bbff] usable modified: [mem 0x0010-0xba27bfff] usable ... modified: [mem 0xba27c000-0xba2947fc] ACPI data modified: [mem 0xba2947fd-0xba294fff] usable And the ACPI data section where the modified tables are placed seem to get correctly inserted at: 0xba27c000-0xba2947fc -> 0x187FC == 100,348 bytes DSDT and FACP (better known as FADT) I passed have a size of (see dmesg parts below): 0x18709 + 0xF4 bytes = 100,349 bytes. Ah wait the 0xba2947fc is inclusive, so it should exactly fit. I then see: DSDT ACPI table found in initrd [0x378f5208-0x3790d910] FACP ACPI table found in initrd [0x3790d9a0-0x3790da93] ACPI: RSDP 000f0410 00024 (v02 INTEL) ACPI: XSDT bdf24d98 0008C (v01 INTEL ROMLEY 06222004 INTL 20090903) ACPI: Override [FACP- ROMLEY], this is unsafe: tainting kernel Disabling lock debugging due to kernel taint ACPI: FACP bdf24a98 Physical table override, new table: ff4af709 ACPI: FACP ba294709 000F4 (v04 INTEL ROMLEY 06222004 INTL 20121220) ACPI BIOS Bug: Warning: Invalid length for FADT/Pm1aControlBlock: 32, using default 16 (20130117/tbfadt-649) ACPI: Override [DSDT- ROMLEY], this is unsafe: tainting kernel ACPI: DSDT bdf09018 Physical table override, new table: ff4af000 ACPI: DSDT ba27c000 18709 (v02 INTEL ROMLEY 0021 INTL 20121220) Later I see my debug string added to the DSDT when the PCI Routing Table (_PRT) is processed: [9.505419] [ACPI Debug] String [0x0A] "XX" And taking the FADT from /sys/firmware/acpi/tables/FACP: my: PM Profile : 04 [Enterprise Server] changed (as expected) to: PM Profile : 02 [Mobile] >From acpi overriding parts: Tested-by: Thomas Renninger I also went through the override related patches and from what I can judge (certainly not the early memory, flat 32 bit memory you call it? specific parts), they look fine. Nice work! Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sched_clock: Prevent 64bit inatomicity on 32bit systems
The sched_clock_remote() implementation has the following inatomicity problem on 32bit systems when accessing the remote scd->clock, which is a 64bit value. CPU0CPU1 sched_clock_local() sched_clock_remote(CPU0) ... remote_clock = scd[CPU0]->clock read_low32bit(scd[CPU0]->clock) cmpxchg64(scd->clock,...) read_high32bit(scd[CPU0]->clock) While the update of scd->clock is using an atomic64 mechanism, the readout on the remote cpu is not, which can cause completely bogus readouts. It is a quite rare problem, because it requires the update to hit the narrow race window between the low/high readout and the update must go across the 32bit boundary. The resulting misbehaviour is, that CPU1 will see the sched_clock on CPU1 ~4 seconds ahead of it's own and update CPU1s sched_clock value to this bogus timestamp. This stays that way due to the clamping implementation for about 4 seconds until the synchronization with CLOCK_MONOTONIC undoes the problem. The issue is hard to observe, because it might only result in a less accurate SCHED_OTHER timeslicing behaviour. To create observable damage on realtime scheduling classes, it is necessary that the bogus update of CPU1 sched_clock happens in the context of an realtime thread, which then gets charged 4 seconds of RT runtime, which results in the RT throttler mechanism to trigger and prevent scheduling of RT tasks for a little less than 4 seconds. So this is quite unlikely as well. The issue was quite hard to decode as the reproduction time is between 2 days and 3 weeks and intrusive tracing makes it less likely, but the following trace recorded with trace_clock=global, which uses sched_clock_local(), gave the final hint: -0 0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80 -0 0d..30 400269.477151: hrtimer_start: hrtimer=0xf7061e80 ... irq/20-S-587 1d..32 400273.772118: sched_wakeup: comm= ... target_cpu=0 -0 0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80 What happens is that CPU0 goes idle and invokes sched_clock_idle_sleep_event() which invokes sched_clock_local() and CPU1 runs a remote wakeup for CPU0 at the same time, which invokes sched_remote_clock(). The time jump gets propagated to CPU0 via sched_remote_clock() and stays stale on both cores for ~4 seconds. There are only two other possibilities, which could cause a stale sched clock: 1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic wrong value. 2) sched_clock() which reads the TSC returns a sporadic wrong value. #1 can be excluded because sched_clock would continue to increase for one jiffy and then go stale. #2 can be excluded because it would not make the clock jump forward. It would just result in a stale sched_clock for one jiffy. After quite some brain twisting and finding the same pattern on other traces, sched_clock_remote() remained the only place which could cause such a problem and as explained above it's indeed racy on 32bit systems. So while on 64bit systems the readout is atomic, we need to verify the remote readout on 32bit machines. We need to protect the local->clock readout in sched_clock_remote() on 32bit as well because an NMI could hit between the low and the high readout, call sched_clock_local() and modify local->clock. Thanks to Siegfried Wulsch for bearing with my debug requests and going through the tedious tasks of running a bunch of reproducer systems to generate the debug information which let me decode the issue. 
Reported-by: Siegfried Wulsch Signed-off-by: Thomas Gleixner Cc: sta...@vger.kernel.org --- Index: linux-stable/kernel/sched/clock.c === --- linux.orig/kernel/sched/clock.c +++ linux/kernel/sched/clock.c @@ -176,10 +176,36 @@ static u64 sched_clock_remote(struct sch u64 this_clock, remote_clock; u64 *ptr, old_val, val; +#if BITS_PER_LONG != 64 +again: + /* +* Careful here: The local and the remote clock values need to +* be read out atomic as we need to compare the values and +* then update either the local or the remote side. So the +* cmpxchg64 below only protects one readout. +* +* We must reread via sched_clock_local() in the retry case on +* 32bit as an NMI could use sched_clock_local() via the +* tracer and hit between the readout of +* the low32bit and the high 32bit portion. +*/ + this_clock = sched_clock_local(my_scd); + /* +* We must enforce atomic readout on 32bit, otherwise the +* update on the remote cpu can hit inbetween the readout of +* the low32bit and the high 32bit portion. +*/ + remote_clock = cmpxchg64(&scd->clock, 0, 0); +#else + /* +* On 64bit the read of [my]scd->clock is atomic versus the +* u
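The idiom the patch relies on, shown in isolation (the helper name is made up; this is a sketch, not part of the patch):

/* On 32-bit, cmpxchg64() with identical old and new values acts as an
 * atomic 64-bit read: if *val == 0 it stores 0 (a no-op), otherwise it
 * stores nothing, and either way the full 64-bit old value comes back in
 * one atomic operation, so a concurrent update cannot be seen half-done. */
static inline u64 read_u64_atomic(u64 *val)
{
	return cmpxchg64(val, 0, 0);
}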
Re: [PATCH 2/2] x86 e820: Introduce memmap=resetusablemap for kdump usage
On Wednesday, January 23, 2013 08:07:19 PM Yinghai Lu wrote: > On Tue, Jan 22, 2013 at 12:06 PM, Yinghai Lu wrote: > > On Tue, Jan 22, 2013 at 8:32 AM, H. Peter Anvin wrote: > >>> Again: Please explain what is bad with this solution. > >>> I cannot see a better and more robust way for kdump other than > >>> reserving the original reserved memory areas as declared by the BIOS. > >> > >> It is bad because it creates more complexity than is needed. > >> > >> The whole point is that what we want is simply to switch type 1 to type > >> X, with the sole exceptions being the areas explicitly reserved for the > >> kdump kernel. > > > > Do you prefer to "reserveram" way in attached patch? > > Hi, Thomas, > > Can you please check attached reserveram version on your setup? > > If it is ok, i will put it in for-x86-boot patchset and send it to > Peter for v3.9. But this (converting usable memory to reserved one before usable kdump memory is added) will let machines run into problems again for which the check: "mmconf area must be in reserved memory" got added? If, then memory which was usable before has to be converted to a special E820_KUMP (or whatever type) to make sure existing checks which look for "is reserved memory" still work the same way as in a productive kernel. Advantage of this would be that the info what originally was usable memory is preserved and can be used in future kdump related patches. So I guess the final patch should be: - Add a new e820 type: E820_KDUMP_RESERVED /* Originally usable memory where the crashed kernel kernel resided in */ - Use Yinghai's last posted patch, but instead of: + e820_update_range(0, ULLONG_MAX, E820_RAM, + E820_RESERVED); ... + e820_remove_range(start_at, mem_size, E820_RESERVED, 0); do: + e820_update_range(0, ULLONG_MAX, E820_RAM, + E820_KDUMP_RESERVED); ... + e820_remove_range(start_at, mem_size, E820_KDUMP_RESERVED, 0); - Come up with another memmap=kdump_reserve_ram memmap option name or however it should get named... If this proposal gets accepted, I can send a tested patch... Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu
On Thursday, January 24, 2013 09:23:14 AM Takao Indoh wrote: > (2013/01/23 9:47), Thomas Renninger wrote: > > On Monday, January 21, 2013 10:11:04 AM Takao Indoh wrote: > >> (2013/01/08 4:09), Thomas Renninger wrote: > > ... > > > >>> I tried the provided patches first on 2.6.32, then I verfied with > >>> 3.8-rc2 > >>> and in both cases the disk is not detected anymore in > >>> reset_devices (kexec'ed/kdump) case (but things work fine without these > >>> patches). > >> > >> So the problem that the disk is not detected was caused by exactmap > >> problem you guys are discussing? Or still not detected even if exactmap > >> problem is fixed? > > > > This problem is related to the 5 PCI resetting patches. > > Dumping worked with a 2.6.32 and a 3.8-rc2 kernel, adding the PCI > > resetting > > patches broke both. I first tried 2.6.32 and verified with 3.8-rc2 to make > > sure I didn't mess up the backport adjustings of the patches to 2.6.32. > If you have a chance please try again the patches with the latest > firmware. Not sure I can update the firmware as this is a reference platform used exactly like this in production. I also cannot see how this could help. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
the patch "perf tools: Update Makefile for Android" broke 3.8-rc perf build.
Linux Kernel Mailing List skrev 12.12.2012 05:13: Gitweb: http://git.kernel.org/linus/;a=commit;h=d816ec2d1bea55cfeac373f0ab0ab8a3105e49b4 Commit: d816ec2d1bea55cfeac373f0ab0ab8a3105e49b4 Parent: 78da39faf7c903bb6e3c20a726fde1bf98d10af8 Author: Irina Tirdea AuthorDate: Mon Oct 8 09:43:27 2012 +0300 Committer: Arnaldo Carvalho de Melo CommitDate: Mon Oct 8 17:42:16 2012 -0300 perf tools: Update Makefile for Android For cross-compiling on Android, some specific changes are needed in the Makefile. The above patch broke perf build on i586 and x86_64: [tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all CHK -fstack-protector-all CHK -Wstack-protector CHK -Wvolatile-register-var CHK bionic :1:31: fatal error: android/api-level.h: No such file or directory compilation terminated. This is a regression since 3.7 -- Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/74] perf python: Fix breakage introduced by the test_attr infrastructure
Arnaldo Carvalho de Melo skrev 24.1.2013 22:07: From: Arnaldo Carvalho de Melo The test_attr infrastructure hooks on the sys_perf_event_open call, checking if a variable is set and if so calling a function to intercept calls and do the checking. But both the variable and the function aren't on objects that are linked on the python binding, breaking it: Atleast this one is 3.8 material as it is a regression since 3.7 [tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all python_ext_build/tmp/util/evsel.o: In function `sys_perf_event_open': /mnt/work/Mageia/RPM/1Work/kerenels/linux-3.8-rc5/tools/perf/util/../perf.h:183: undefined reference to `test_attr__enabled' /mnt/work/Mageia/RPM/1Work/kerenels/linux-3.8-rc5/tools/perf/util/../perf.h:184: undefined reference to `test_attr__open' collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1 -- Thomas # perf test -v 15 15: Try 'use perf' in python, checking link problems : --- start --- Traceback (most recent call last): File "", line 1, in ImportError: /home/acme/git/build/perf//python/perf.so: undefined symbol: test_attr__enabled end Try 'use perf' in python, checking link problems: FAILED! # Fix it by moving the variable to one of the linked object files and providing a stub for the function in the python.o object, that is only linked in the python binding. Now 'perf test' is happy again: # perf test 15 15: Try 'use perf' in python, checking link problems : Ok # Cc: David Ahern Cc: Frederic Weisbecker Cc: Jiri Olsa Cc: Mike Galbraith Cc: Namhyung Kim Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lkml.kernel.org/n/tip-0rsca2kn44b38rgdpr3tz...@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/attr.c | 2 -- tools/perf/util/python.c | 9 + tools/perf/util/util.c | 2 ++ 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c index 25638a9..05b5acb 100644 --- a/tools/perf/tests/attr.c +++ b/tools/perf/tests/attr.c @@ -33,8 +33,6 @@ extern int verbose; -bool test_attr__enabled; - static char *dir; void test_attr__init(void) diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c index a2657fd..925e0c3 100644 --- a/tools/perf/util/python.c +++ b/tools/perf/util/python.c @@ -1045,3 +1045,12 @@ error: if (PyErr_Occurred()) PyErr_SetString(PyExc_ImportError, "perf: Init failed!"); } + +/* + * Dummy, to avoid dragging all the test_attr infrastructure in the python + * binding. + */ +void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu, + int fd, int group_fd, unsigned long flags) +{ +} diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c index 5906e84..252b889 100644 --- a/tools/perf/util/util.c +++ b/tools/perf/util/util.c @@ -12,6 +12,8 @@ */ unsigned int page_size; +bool test_attr__enabled; + bool perf_host = true; bool perf_guest = false; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
perf 3.8-rc build failure: undefined reference to `strlcpy'
[tmb@tmb linux-3.8-rc5]$ make -C tools/perf -s V=1 HAVE_CPLUS_DEMANGLE=1 prefix=%{_prefix} all ... /tmp/ccJEJv6m.o: In function `main': :(.text+0x14): undefined reference to `strlcpy' collect2: ld returned 1 exit status ... This did not show up in 3.7 -- Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
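For reference, strlcpy() is a BSD/kernel string helper that glibc does not provide on the host, which is why the link above fails. The sketch below only illustrates the conventional semantics (copy at most size - 1 bytes, always NUL-terminate, return the full source length); it is not the fix that went into perf.

#include <string.h>

/* Illustration only: conventional strlcpy() semantics. */
size_t strlcpy(char *dest, const char *src, size_t size)
{
	size_t ret = strlen(src);

	if (size) {
		size_t len = (ret >= size) ? size - 1 : ret;

		memcpy(dest, src, len);
		dest[len] = '\0';
	}
	return ret;
}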
Re: [PATCH 2/2] x86 e820: Introduce memmap=resetusablemap for kdump usage
On Tuesday, January 29, 2013 03:10:38 AM Yinghai Lu wrote: > On Mon, Jan 28, 2013 at 5:11 PM, H. Peter Anvin wrote: > >> So I guess the final patch should be: > >>- Add a new e820 type: > >> E820_KDUMP_RESERVED /* Originally usable memory where the crashed > >> kernel kernel resided > >> in */ > >> - Use Yinghai's last posted patch, but instead of: > >> + e820_update_range(0, ULLONG_MAX, E820_RAM, > >> + E820_RESERVED); > >> ... > >> + e820_remove_range(start_at, mem_size, E820_RESERVED, > >> 0); > >> do: > >> + e820_update_range(0, ULLONG_MAX, E820_RAM, > >> + E820_KDUMP_RESERVED); > >> ... > >> + e820_remove_range(start_at, mem_size, > >> E820_KDUMP_RESERVED, 0); > >> > >> - Come up with another memmap=kdump_reserve_ram memmap option name > >> or however it should get named... > >> > >> If this proposal gets accepted, I can send a tested patch... > >> > > > > Yes, this is much saner. There really shouldn't need to be an option, > > even; since the tools need to be modified anyway, just modify the actual > > memory map data structure itself. > > yes, > > kexec-tools will change that to E820_KDUMP_RESERVED (or other good name). > > We only need to update kernel to get old max_pfn by > checking E820_KDUMP_RESERVED. Wait, above proposal does not include kexec-tools mangling of the e820 table, for several reasons: - Keep the boot interface clean and pass the original table - Only one possible error source on e820 table modifications - While hpa proposed kexec-tools to pass a modified e820 table to make things easier, exactly the opposite is the case: If kexec-tools and the kernel modify the table, things are more complex and hard to understand in case of debugging where things went wrong - It's really easy to do that in the kernel. As shown above it should simply be this line to change usable areas into E820_KDUMP_RESERVED ones: e820_update_range(0, ULLONG_MAX, E820_RAM, E820_KDUMP_RESERVED); and possibly slight adjusting when the memmap=X#Y memory the kdump kernel uses is added (has to override E820_KDUMP_RESERVED areas with usable memory again) My previously posted kexec-tools patches should simply work, it's just that the memmap option name changes to: memmap=kdump_reserve_ram This is what I proposed and is IMO the best and less complex way to go. I guess I still wait another day for comments and will send something if you agree. Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
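To make the proposal above concrete, here is a minimal, untested sketch of what the kernel side could do when a hypothetical memmap=kdump_reserve_ram option has been parsed. It assumes the proposed E820_KDUMP_RESERVED type exists and only reuses the e820 helpers quoted in the thread.

/* Sketch only: start_at/mem_size are the memmap=X#Y region handed to
 * the kdump kernel. */
static void __init kdump_reserve_ram_setup(u64 start_at, u64 mem_size)
{
	/* Everything that was usable RAM in the crashed kernel... */
	e820_update_range(0, ULLONG_MAX, E820_RAM, E820_KDUMP_RESERVED);

	/* ...except the region the kdump kernel itself runs in, which
	 * has to become usable memory again. */
	e820_remove_range(start_at, mem_size, E820_KDUMP_RESERVED, 0);
	e820_add_region(start_at, mem_size, E820_RAM);
}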
Re: [PATCH 1/5] net: mvmdio: unmap base register address at driver removal
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:04 +0100, Florian Fainelli wrote: > Fix the driver remove callback to unmap the base register address and > not leak this mapping after the driver has been removed. > > Signed-off-by: Florian Fainelli What about using devm_request_and_ioremap() instead, in order to get automatic unmap on error and in the ->remove() path? But maybe it won't work because this memory range is claimed both by the MDIO driver and the Ethernet driver itself. In that case, you could use devm_ioremap(). Best regards, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
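As an illustration of the devm_ioremap() suggestion (a sketch, not the posted patch): once the mapping is managed by devres, ->remove() no longer needs an explicit iounmap(), and since devm_ioremap() does not request the region, the overlap with the Ethernet driver's registers stays harmless.

static int orion_mdio_remove(struct platform_device *pdev)
{
	struct mii_bus *bus = platform_get_drvdata(pdev);

	mdiobus_unregister(bus);
	kfree(bus->irq);
	mdiobus_free(bus);
	/* No iounmap() here: the mapping created with devm_ioremap()
	 * in ->probe() is released automatically by the devres core. */
	return 0;
}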
Re: [PATCH 2/5] net: mvmdio: rename base register cookie from smireg to regs
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:05 +0100, Florian Fainelli wrote: > This patch renames the base register cookie in the mvmdio drive from > "smireg" to "regs" since a subsequent patch is going to use an ioremap() > cookie whose size is larger than a single register of 4 bytes. No > functionnal code change introduced. > > Signed-off-by: Florian Fainelli Acked-by: Thomas Petazzoni -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/5] net: mvmdio: enhance driver to support SMI error/done interrupts
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:06 +0100, Florian Fainelli wrote: > #define MVMDIO_SMI_DATA_SHIFT 0 > #define MVMDIO_SMI_PHY_ADDR_SHIFT 16 > @@ -36,12 +40,28 @@ > #define MVMDIO_SMI_WRITE_OPERATION 0 > #define MVMDIO_SMI_READ_VALID BIT(27) > #define MVMDIO_SMI_BUSYBIT(28) > +#define MVMDIO_ERR_INT_CAUSE0x007C > +#define MVMDIO_ERR_INT_SMI_DONE0x0010 > +#define MVMDIO_ERR_INT_MASK 0x0080 > > struct orion_mdio_dev { > struct mutex lock; > void __iomem *regs; > + /* > + * If we have access to the error interrupt pin (which is > + * somewhat misnamed as it not only reflects internal errors > + * but also reflects SMI completion), use that to wait for > + * SMI access completion instead of polling the SMI busy bit. > + */ > + int err_interrupt; > + wait_queue_head_t smi_busy_wait; > }; > > +static int orion_mdio_smi_is_done(struct orion_mdio_dev *dev) > +{ > + return !(readl(dev->regs) & MVMDIO_SMI_BUSY); > +} > + > /* Wait for the SMI unit to be ready for another operation > */ > static int orion_mdio_wait_ready(struct mii_bus *bus) > @@ -50,19 +70,30 @@ static int orion_mdio_wait_ready(struct mii_bus *bus) > int count; > u32 val; > > - count = 0; > - while (1) { > - val = readl(dev->regs); > - if (!(val & MVMDIO_SMI_BUSY)) > - break; > - > - if (count > 100) { > - dev_err(bus->parent, "Timeout: SMI busy for too > long\n"); > - return -ETIMEDOUT; > + if (dev->err_interrupt == NO_IRQ) { > + count = 0; > + while (1) { > + val = readl(dev->regs); > + if (!(val & MVMDIO_SMI_BUSY)) > + break; What about using your new orion_mdio_smi_is_done() function here? > + > + if (count > 100) { > + dev_err(bus->parent, > + "Timeout: SMI busy for too long\n"); > + return -ETIMEDOUT; > + } > + > + udelay(10); > + count++; > } > + } > > - udelay(10); > - count++; > + if (!orion_mdio_smi_is_done(dev)) { Maybe it should be in an else if block so that the waitqueue case is only considered if there is an IRQ registered? Of course practically speaking, it's OK because if there is no IRQ, we'll wait in the polling loop above, and either exit from the function on timeout, or continue on success. But it still would make the code a little bit clearer, I'd say. > static int orion_mdio_probe(struct platform_device *pdev) > { > struct device_node *np = pdev->dev.of_node; > @@ -181,6 +227,19 @@ static int orion_mdio_probe(struct platform_device *pdev) > return -ENODEV; > } > > + dev->err_interrupt = NO_IRQ; Not needed, you already do dev->err_interrupt = something() below. > + init_waitqueue_head(&dev->smi_busy_wait); > + > + dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + if (dev->err_interrupt != NO_IRQ) { > + ret = devm_request_irq(&pdev->dev, dev->err_interrupt, > + orion_mdio_err_irq, > + IRQF_SHARED, pdev->name, dev); > + if (!ret) > + writel(MVMDIO_ERR_INT_SMI_DONE, > + dev->regs + MVMDIO_ERR_INT_MASK); > + } > + > mutex_init(&dev->lock); > > ret = of_mdiobus_register(bus, np); > @@ -202,6 +261,8 @@ static int orion_mdio_remove(struct platform_device *pdev) > struct mii_bus *bus = platform_get_drvdata(pdev); > struct orion_mdio_dev *dev = bus->priv; > > + writel(0, dev->regs + MVMDIO_ERR_INT_MASK); > + free_irq(dev->err_interrupt, dev); free_irq() not needed since the IRQ handler is registered with devm_request_irq(). > mdiobus_unregister(bus); > kfree(bus->irq); > mdiobus_free(bus); Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. 
http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
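Putting the two remarks above together, the wait routine could look roughly like this (a sketch against the quoted patch; the 100 ms sleep timeout is an arbitrary value): the polling branch reuses orion_mdio_smi_is_done(), and the waitqueue path only runs when an interrupt is available.

static int orion_mdio_wait_ready(struct mii_bus *bus)
{
	struct orion_mdio_dev *dev = bus->priv;
	int count = 0;

	if (dev->err_interrupt == NO_IRQ) {
		/* No interrupt available: poll the SMI busy bit. */
		while (!orion_mdio_smi_is_done(dev)) {
			if (count > 100) {
				dev_err(bus->parent,
					"Timeout: SMI busy for too long\n");
				return -ETIMEDOUT;
			}
			udelay(10);
			count++;
		}
	} else if (!orion_mdio_smi_is_done(dev)) {
		/* Interrupt available: sleep until the handler wakes us. */
		if (!wait_event_timeout(dev->smi_busy_wait,
					orion_mdio_smi_is_done(dev),
					msecs_to_jiffies(100)))
			return -ETIMEDOUT;
	}

	return 0;
}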
Re: [PATCH 4/5] net: mvmdio: allow Device Tree and platform device to coexist
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:07 +0100, Florian Fainelli wrote: > This patch changes the Marvell MDIO driver to be registered by using > both Device Tree and platform device methods. The driver voluntarily > does not use devm_ioremap() to share the same error path for Device Tree > and non-Device Tree cases. Not sure why you think devm_ioremap() can't be used here. Maybe I'm missing something, but could you explain? If you use devm_ioremap(), then basically you don't need to do anything in the error path regarding to the I/O mapping... since it's the whole purpose of the devm_*() stuff to automagically undo things in the error case, and in the ->remove() code. > - dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + if (pdev->dev.of_node) { > + dev->regs = of_iomap(pdev->dev.of_node, 0); > + if (!dev->regs) { > + dev_err(&pdev->dev, "No SMI register address given in > DT\n"); > + ret = -ENODEV; > + goto out_free; > + } > + > + dev->err_interrupt = irq_of_parse_and_map(pdev->dev.of_node, 0); > + } else { > + r = platform_get_resource(pdev, IORESOURCE_MEM, 0); > + > + dev->regs = ioremap(r->start, resource_size(r)); > + if (!dev->regs) { > + dev_err(&pdev->dev, "No SMI register address given\n"); > + ret = -ENODEV; > + goto out_free; > + } > + > + dev->err_interrupt = platform_get_irq(pdev, 0); > + } I think you can do a devm_ioremap() and a platform_get_irq() in both cases here, and therefore keep the code common between the DT case and the !DT case. Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
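To illustrate the last point, a hypothetical helper (the name is made up) that serves both the DT and the non-DT case: when the device comes from DT, the OF core fills in the platform resources, so platform_get_resource() and platform_get_irq() work in both situations, and devm_ioremap() removes the unmap from the error path entirely.

/* Hypothetical helper, not part of the posted patch. */
static int orion_mdio_map(struct platform_device *pdev,
			  struct orion_mdio_dev *dev)
{
	struct resource *r;

	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	if (!r) {
		dev_err(&pdev->dev, "No SMI register address given\n");
		return -ENODEV;
	}

	dev->regs = devm_ioremap(&pdev->dev, r->start, resource_size(r));
	if (!dev->regs) {
		dev_err(&pdev->dev, "Unable to remap SMI registers\n");
		return -ENOMEM;
	}

	/* Covers both the DT and the board-file case. */
	dev->err_interrupt = platform_get_irq(pdev, 0);
	return 0;
}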
Re: [PATCH 5/5] mv643xx_eth: convert to use the Marvell Orion MDIO driver
Dear Florian Fainelli, On Tue, 29 Jan 2013 16:24:08 +0100, Florian Fainelli wrote: > This patch converts the Marvell MV643XX ethernet driver to use the > Marvell Orion MDIO driver. As a result, PowerPC and ARM platforms > registering the Marvell MV643XX ethernet driver are also updated to > register a Marvell Orion MDIO driver. This driver voluntarily overlaps > with the Marvell Ethernet shared registers because it will use a subset > of this shared register (shared_base + 0x4 - shared_base + 0x84). The > Ethernet driver is also updated to look up for a PHY device using the > Orion MDIO bus driver. > > Signed-off-by: Florian Fainelli > --- > arch/arm/plat-orion/common.c | 84 +++-- In this file, there was one "MV643XX_ETH_SHARED_NAME" platform_device registered for each network interface. Why? If the driver is shared, isn't the whole idea to register it only once? In any case, one of the ideas behind separating the mvmdio driver from the mvneta driver in the first place is that there should be only one instance of the mvmdio device, even if there are multiple network interfaces. The reason is that from a HW point of view, the MDIO unit is shared between the network interfaces. If you look at armada-370-xp.dtsi, there is only one mvmdio device registered, and two network interfaces (using the mvneta driver) that are registered (and actually up to four network interfaces can exist, they are added by some other .dtsi files depending on the specific SoC). So I don't think there should be one instance of the mvmdio per network interface. Also, I am wondering what's left in this MV643XX_ETH_SHARED_NAME driver once the MDIO stuff has been pulled out into a separate driver? I think the whole point of this work should be to get rid of this MV643XX_ETH_SHARED_NAME driver, no? Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
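For the non-DT boards, the idea would be something along these lines (a sketch with placeholder addresses, and assuming the mvmdio driver binds to the "orion-mdio" platform device name): one device registered once by the platform code, regardless of how many interfaces a board enables.

/* Placeholder base address/size, not taken from a real board file. */
static struct resource orion_mdio_resources[] = {
	{
		.start	= 0xf1072004,
		.end	= 0xf1072004 + 0x84 - 1,
		.flags	= IORESOURCE_MEM,
	},
};

static struct platform_device orion_mdio_device = {
	.name		= "orion-mdio",
	.id		= -1,
	.num_resources	= ARRAY_SIZE(orion_mdio_resources),
	.resource	= orion_mdio_resources,
};

void __init orion_mdio_init(void)
{
	/* Exactly one shared MDIO instance for the whole SoC. */
	platform_device_register(&orion_mdio_device);
}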
Re: [PATCH 5/5] mv643xx_eth: convert to use the Marvell Orion MDIO driver
Dear Florian Fainelli, On Tue, 29 Jan 2013 17:27:56 +0100, Florian Fainelli wrote: > It looks like I introduced two redundant mvmdio instances as ge01 > refers to the ge00 smi bus (the same applies to ge11 and ge10). > Thanks for spotting this. Ok, good. > If you take a closer look at mv643xx_eth you will see that the > "shared" driver still handles the mconf bus window configuration, > which is not abstracted yet. Indeed, I've seen that. But I don't understand why it's done in the mv643xx_eth_shared_probe(). The mbus window configuration registers are per-network interface, so this call to mv643xx_eth_conf_mbus_windows() could presumably be done in mv643xx_eth_probe(). At least in mvneta, we have the same registers, and we do their initialization in the driver's normal (and only) ->probe() routine. > Besides that, I would rather do it step by step. Yes, agreed. But I think it would be good to have follow-up patches that progressively get rid of the shared driver thing, as it will help in bringing a proper DT binding to the mv643xx_eth driver. But it certainly doesn't need to be part of this specific patch. Thanks, Thomas -- Thomas Petazzoni, Free Electrons Kernel, drivers, real-time and embedded Linux development, consulting, training and support. http://free-electrons.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpuidle: fix new C-states not functional after AC disconnect
On Sunday, 13 January 2013, 21:44:41, Daniel Lezcano wrote: > On 01/13/2013 09:36 PM, Sedat Dilek wrote: > > 0001: Refreshed 1-2 as v3 against Linux v3.8-rc3. > > 0002: v2 of 2-2 applied cleanly after 1-2 was refreshed! > > Hi Sedat, > > for the moment, you should use only the 1/2 because 2/2 (which is an > optimization) is wrong. Hi Daniel, thanks again for this patch; together with my patch it finally fixes the bug. Now I have noticed that only my patch was sent to stable (thanks Rafael), but not yours. So the bug is not completely fixed in 3.4 and 3.7. Is there a reason for not sending this to stable, too? Kind regards, Thomas -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel failing to boot when compressed with bzip2
On 30/01/2013 14:58, Rob Landley wrote: > > On 01/20/2013 03:55:05 PM, Thomas Capricelli wrote: > >> So my guess is that there's something badly broken in the bzip2 kernel >> decompressing code.. ? There's both a regression between kernel 3.6 and >> 3.7, and a problem with gcc-4.7. > Worked For Me. I note that I built with gcc 4.2.1 and binutils 2.17 > (the last GPLv2 releases), and it's for i686 I guess that's the main point: my bug appears on AMD64, not i686, and I can trigger it with either gcc-4.6 or gcc-4.7. I'm pretty sure it still works with a gcc as old as 4.2.1. I use binutils 2.23.1, but I doubt it matters here. Again, on amd64 using vanilla kernel releases (3.x.y):
linux-3.6 with gcc-4.6 : ok
linux-3.6 with gcc-4.7 : fail
linux-3.7 with gcc-4.6 : fail
linux-3.7 with gcc-4.7 : fail
I've had this BUNZIP2 option for a very long time, so I'm sure it was working well with previous versions of both gcc and the Linux kernel. greetings, Thomas -- Thomas Capricelli http://www.freehackers.org/thomas/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 05/40] cpu: Restructure cpu_down code
Split out into separate functions, so we can convert it to a state machine. Signed-off-by: Thomas Gleixner --- kernel/cpu.c | 69 --- 1 file changed, 47 insertions(+), 22 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -168,6 +168,43 @@ static int cpu_notify(unsigned long val, return __cpu_notify(val, cpu, -1, NULL); } +/* Notifier wrappers for transitioning to state machine */ +static int notify_prepare(unsigned int cpu) +{ + int nr_calls = 0; + int ret; + + ret = __cpu_notify(CPU_UP_PREPARE, cpu, -1, &nr_calls); + if (ret) { + nr_calls--; + printk(KERN_WARNING "%s: attempt to bring up CPU %u failed\n", + __func__, cpu); + __cpu_notify(CPU_UP_CANCELED, cpu, nr_calls, NULL); + } + return ret; +} + +static int notify_online(unsigned int cpu) +{ + cpu_notify(CPU_ONLINE, cpu); + return 0; +} + +static int bringup_cpu(unsigned int cpu) +{ + struct task_struct *idle = idle_thread_get(cpu); + int ret; + + /* Arch-specific enabling code. */ + ret = __cpu_up(cpu, idle); + if (ret) { + cpu_notify(CPU_UP_CANCELED, cpu); + return ret; + } + BUG_ON(!cpu_online(cpu)); + return 0; +} + #ifdef CONFIG_HOTPLUG_CPU static void cpu_notify_nofail(unsigned long val, unsigned int cpu) @@ -340,7 +377,7 @@ EXPORT_SYMBOL(cpu_down); static int __cpuinit _cpu_up(unsigned int cpu, int tasks_frozen) { struct task_struct *idle; - int ret, nr_calls = 0; + int ret; cpu_hotplug_begin(); @@ -355,35 +392,23 @@ static int __cpuinit _cpu_up(unsigned in goto out; } + cpuhp_tasks_frozen = tasks_frozen; + ret = smpboot_create_threads(cpu); if (ret) goto out; - cpuhp_tasks_frozen = tasks_frozen; - - ret = __cpu_notify(CPU_UP_PREPARE, cpu, -1, &nr_calls); - if (ret) { - nr_calls--; - printk(KERN_WARNING "%s: attempt to bring up CPU %u failed\n", - __func__, cpu); - goto out_notify; - } + ret = notify_prepare(cpu); + if (ret) + goto out; - /* Arch-specific enabling code. */ - ret = __cpu_up(cpu, idle); - if (ret != 0) - goto out_notify; - BUG_ON(!cpu_online(cpu)); + ret = bringup_cpu(cpu); + if (ret) + goto out; /* Wake the per cpu threads */ smpboot_unpark_threads(cpu); - - /* Now call notifier in preparation. */ - cpu_notify(CPU_ONLINE, cpu); - -out_notify: - if (ret != 0) - __cpu_notify(CPU_UP_CANCELED, cpu, nr_calls, NULL); + notify_online(cpu); out: cpu_hotplug_done(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 15/40] x86: perf: Convert AMD IBS to hotplug state machine
Install the callbacks via the state machine and let the core invoke the callbacks on the already online cpus. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_amd_ibs.c | 54 +++ include/linux/cpuhotplug.h |1 2 files changed, 21 insertions(+), 34 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_amd_ibs.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_amd_ibs.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_amd_ibs.c @@ -637,13 +637,10 @@ static __init int perf_ibs_pmu_init(stru return ret; } -static __init int perf_event_ibs_init(void) +static __init void perf_event_ibs_init(void) { struct attribute **attr = ibs_op_format_attrs; - if (!ibs_caps) - return -ENODEV; /* ibs not supported by the cpu */ - perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch"); if (ibs_caps & IBS_CAPS_OPCNT) { @@ -654,13 +651,11 @@ static __init int perf_event_ibs_init(vo register_nmi_handler(NMI_LOCAL, perf_ibs_nmi_handler, 0, "perf_ibs"); printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps); - - return 0; } #else /* defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) */ -static __init int perf_event_ibs_init(void) { return 0; } +static __init void perf_event_ibs_init(void) { } #endif @@ -827,11 +822,10 @@ static inline int get_ibs_lvt_offset(voi return val & IBSCTL_LVT_OFFSET_MASK; } -static void setup_APIC_ibs(void *dummy) +static void setup_APIC_ibs(void) { - int offset; + int offset = get_ibs_lvt_offset(); - offset = get_ibs_lvt_offset(); if (offset < 0) goto failed; @@ -842,30 +836,19 @@ failed: smp_processor_id()); } -static void clear_APIC_ibs(void *dummy) +static int __cpuinit x86_pmu_amd_ibs_starting_cpu(unsigned int cpu) { - int offset; - - offset = get_ibs_lvt_offset(); - if (offset >= 0) - setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1); + setup_APIC_ibs(); + return 0; } -static int __cpuinit -perf_ibs_cpu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) +static int __cpuinit x86_pmu_amd_ibs_dying_cpu(unsigned int cpu) { - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_STARTING: - setup_APIC_ibs(NULL); - break; - case CPU_DYING: - clear_APIC_ibs(NULL); - break; - default: - break; - } + int offset = get_ibs_lvt_offset(); - return NOTIFY_OK; + if (offset >= 0) + setup_APIC_eilvt(offset, 0, APIC_EILVT_MSG_FIX, 1); + return 0; } static __init int amd_ibs_init(void) @@ -889,15 +872,18 @@ static __init int amd_ibs_init(void) if (!ibs_eilvt_valid()) goto out; - get_online_cpus(); ibs_caps = caps; /* make ibs_caps visible to other cpus: */ smp_mb(); - perf_cpu_notifier(perf_ibs_cpu_notifier); - smp_call_function(setup_APIC_ibs, NULL, 1); - put_online_cpus(); + /* +* x86_pmu_amd_ibs_starting_cpu will be called from core on +* all online cpus. 
+*/ + cpuhp_setup_state(CPUHP_AP_PERF_X86_AMD_IBS_STARTING, + x86_pmu_amd_ibs_starting_cpu, + x86_pmu_amd_ibs_dying_cpu); - ret = perf_event_ibs_init(); + perf_event_ibs_init(); out: if (ret) pr_err("Failed to setup IBS, %d\n", ret); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_AP_OFFLINE, CPUHP_AP_SCHED_STARTING, CPUHP_AP_PERF_X86_UNCORE_STARTING, + CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 39/40] relayfs: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |7 + kernel/cpu.c |4 +++ kernel/relay.c | 59 ++--- 3 files changed, 25 insertions(+), 45 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -18,6 +18,7 @@ enum cpuhp_states { CPUHP_PROFILE_PREPARE, CPUHP_X2APIC_PREPARE, CPUHP_SMPCFD_PREPARE, + CPUHP_RELAY_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -204,4 +205,10 @@ int profile_online_cpu(unsigned int cpu) int smpcfd_prepare_cpu(unsigned int cpu); int smpcfd_dead_cpu(unsigned int cpu); +#ifdef CONFIG_RELAY +int relay_prepare_cpu(unsigned int cpu); +#else +#define relay_prepare_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -768,6 +768,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = smpcfd_prepare_cpu, .teardown = smpcfd_dead_cpu, }, + [CPUHP_RELAY_PREPARE] = { + .startup = relay_prepare_cpu, + .teardown = NULL, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/relay.c === --- linux-2.6.orig/kernel/relay.c +++ linux-2.6/kernel/relay.c @@ -508,46 +508,24 @@ static void setup_callbacks(struct rchan chan->cb = cb; } -/** - * relay_hotcpu_callback - CPU hotplug callback - * @nb: notifier block - * @action: hotplug action to take - * @hcpu: CPU number - * - * Returns the success/failure of the operation. (%NOTIFY_OK, %NOTIFY_BAD) - */ -static int __cpuinit relay_hotcpu_callback(struct notifier_block *nb, - unsigned long action, - void *hcpu) +int __cpuinit relay_prepare_cpu(unsigned int cpu) { - unsigned int hotcpu = (unsigned long)hcpu; struct rchan *chan; - switch(action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - mutex_lock(&relay_channels_mutex); - list_for_each_entry(chan, &relay_channels, list) { - if (chan->buf[hotcpu]) - continue; - chan->buf[hotcpu] = relay_open_buf(chan, hotcpu); - if(!chan->buf[hotcpu]) { - printk(KERN_ERR - "relay_hotcpu_callback: cpu %d buffer " - "creation failed\n", hotcpu); - mutex_unlock(&relay_channels_mutex); - return notifier_from_errno(-ENOMEM); - } - } - mutex_unlock(&relay_channels_mutex); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - /* No need to flush the cpu : will be flushed upon -* final relay_flush() call. */ - break; + mutex_lock(&relay_channels_mutex); + list_for_each_entry(chan, &relay_channels, list) { + if (chan->buf[cpu]) + continue; + chan->buf[cpu] = relay_open_buf(chan, cpu); + if(!chan->buf[cpu]) { + pr_err("relay: cpu %d buffer creation failed\n", cpu); + mutex_unlock(&relay_channels_mutex); + return -ENOMEM; + } } - return NOTIFY_OK; + + mutex_unlock(&relay_channels_mutex); + return 0; } /** @@ -1355,12 +1333,3 @@ const struct file_operations relay_file_ .splice_read= relay_file_splice_read, }; EXPORT_SYMBOL_GPL(relay_file_operations); - -static __init int relay_init(void) -{ - - hotcpu_notifier(relay_hotcpu_callback, 0); - return 0; -} - -early_initcall(relay_init); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 40/40] slab: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 15 ++ kernel/cpu.c |8 +++ mm/slab.c | 102 ++--- 3 files changed, 64 insertions(+), 61 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -19,6 +19,7 @@ enum cpuhp_states { CPUHP_X2APIC_PREPARE, CPUHP_SMPCFD_PREPARE, CPUHP_RELAY_PREPARE, + CPUHP_SLAB_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -49,6 +50,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_ONLINE, CPUHP_CPUFREQ_ONLINE, CPUHP_RCUTREE_ONLINE, + CPUHP_SLAB_ONLINE, CPUHP_NOTIFY_ONLINE, CPUHP_PROFILE_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, @@ -211,4 +213,17 @@ int relay_prepare_cpu(unsigned int cpu); #define relay_prepare_cpu NULL #endif +/* slab hotplug events */ +#if defined(CONFIG_SLAB) && defined(CONFIG_SMP) +int slab_prepare_cpu(unsigned int cpu); +int slab_online_cpu(unsigned int cpu); +int slab_offline_cpu(unsigned int cpu); +int slab_dead_cpu(unsigned int cpu); +#else +#define slab_prepare_cpu NULL +#define slab_online_cpuNULL +#define slab_offline_cpu NULL +#define slab_dead_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -772,6 +772,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = relay_prepare_cpu, .teardown = NULL, }, + [CPUHP_SLAB_PREPARE] = { + .startup = slab_prepare_cpu, + .teardown = slab_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -820,6 +824,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = profile_online_cpu, .teardown = NULL, }, + [CPUHP_SLAB_ONLINE] = { + .startup = slab_online_cpu, + .teardown = slab_offline_cpu, + }, [CPUHP_NOTIFY_DOWN_PREPARE] = { .startup = NULL, .teardown = notify_down_prepare, Index: linux-2.6/mm/slab.c === --- linux-2.6.orig/mm/slab.c +++ linux-2.6/mm/slab.c @@ -1426,65 +1426,51 @@ bad: return -ENOMEM; } -static int __cpuinit cpuup_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +int __cpuinit slab_prepare_cpu(unsigned int cpu) { - long cpu = (long)hcpu; - int err = 0; + int err; - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - mutex_lock(&slab_mutex); - err = cpuup_prepare(cpu); - mutex_unlock(&slab_mutex); - break; - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - start_cpu_timer(cpu); - break; -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DOWN_PREPARE: - case CPU_DOWN_PREPARE_FROZEN: - /* -* Shutdown cache reaper. Note that the slab_mutex is -* held so that if cache_reap() is invoked it cannot do -* anything expensive but will only modify reap_work -* and reschedule the timer. - */ - cancel_delayed_work_sync(&per_cpu(slab_reap_work, cpu)); - /* Now the cache_reaper is guaranteed to be not running. */ - per_cpu(slab_reap_work, cpu).work.func = NULL; - break; - case CPU_DOWN_FAILED: - case CPU_DOWN_FAILED_FROZEN: - start_cpu_timer(cpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - /* -* Even if all the cpus of a node are down, we don't free the -* kmem_list3 of any cache. This to avoid a race between -* cpu_down, and a kmalloc allocation from another cpu for -* memory from the node of the cpu going down. The list3 -* structure is usually allocated from kmem_cache_create() and -* gets destroyed at kmem_cache_destroy(). 
-*/ - /* fall through */ -#endif - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - mutex_lock(&slab_mutex); - cpuup_canceled(cpu); - mutex_unlock(&slab_mutex); - break; - } - return notifier_from_errno(err); + mutex_lock(&slab_mutex); + err = cpu
[patch 04/40] cpu: Restructure FROZEN state handling
There are only a few callbacks which really care about FROZEN vs. !FROZEN. No need to have extra states for this. Publish the frozen state in an extra variable which is updated under the hotplug lock and let the users interested deal with it w/o imposing that extra state checks on everyone. Signed-off-by: Thomas Gleixner --- kernel/cpu.c | 66 --- 1 file changed, 27 insertions(+), 39 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -25,6 +25,7 @@ #ifdef CONFIG_SMP /* Serializes the updates to cpu_online_mask, cpu_present_mask */ static DEFINE_MUTEX(cpu_add_remove_lock); +static bool cpuhp_tasks_frozen; /* * The following two API's must be used when attempting @@ -148,27 +149,30 @@ int __ref register_cpu_notifier(struct n return ret; } -static int __cpu_notify(unsigned long val, void *v, int nr_to_call, +static int __cpu_notify(unsigned long val, unsigned int cpu, int nr_to_call, int *nr_calls) { + unsigned long mod = cpuhp_tasks_frozen ? CPU_TASKS_FROZEN : 0; + void *hcpu = (void *)(long)cpu; + int ret; - ret = __raw_notifier_call_chain(&cpu_chain, val, v, nr_to_call, + ret = __raw_notifier_call_chain(&cpu_chain, val | mod, hcpu, nr_to_call, nr_calls); return notifier_to_errno(ret); } -static int cpu_notify(unsigned long val, void *v) +static int cpu_notify(unsigned long val, unsigned int cpu) { - return __cpu_notify(val, v, -1, NULL); + return __cpu_notify(val, cpu, -1, NULL); } #ifdef CONFIG_HOTPLUG_CPU -static void cpu_notify_nofail(unsigned long val, void *v) +static void cpu_notify_nofail(unsigned long val, unsigned int cpu) { - BUG_ON(cpu_notify(val, v)); + BUG_ON(cpu_notify(val, cpu)); } EXPORT_SYMBOL(register_cpu_notifier); @@ -237,23 +241,17 @@ static inline void check_for_tasks(int c write_unlock_irq(&tasklist_lock); } -struct take_cpu_down_param { - unsigned long mod; - void *hcpu; -}; - /* Take this CPU down. */ static int __ref take_cpu_down(void *_param) { - struct take_cpu_down_param *param = _param; - int err; + int err, cpu = smp_processor_id(); /* Ensure this CPU doesn't handle any more interrupts. */ err = __cpu_disable(); if (err < 0) return err; - cpu_notify(CPU_DYING | param->mod, param->hcpu); + cpu_notify(CPU_DYING, cpu); /* Park the stopper thread */ kthread_park(current); return 0; @@ -263,12 +261,6 @@ static int __ref take_cpu_down(void *_pa static int __ref _cpu_down(unsigned int cpu, int tasks_frozen) { int err, nr_calls = 0; - void *hcpu = (void *)(long)cpu; - unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0; - struct take_cpu_down_param tcd_param = { - .mod = mod, - .hcpu = hcpu, - }; if (num_online_cpus() == 1) return -EBUSY; @@ -278,21 +270,23 @@ static int __ref _cpu_down(unsigned int cpu_hotplug_begin(); - err = __cpu_notify(CPU_DOWN_PREPARE | mod, hcpu, -1, &nr_calls); + cpuhp_tasks_frozen = tasks_frozen; + + err = __cpu_notify(CPU_DOWN_PREPARE, cpu, -1, &nr_calls); if (err) { nr_calls--; - __cpu_notify(CPU_DOWN_FAILED | mod, hcpu, nr_calls, NULL); + __cpu_notify(CPU_DOWN_FAILED, cpu, nr_calls, NULL); printk("%s: attempt to take down CPU %u failed\n", __func__, cpu); goto out_release; } smpboot_park_threads(cpu); - err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu)); + err = __stop_machine(take_cpu_down, NULL, cpumask_of(cpu)); if (err) { /* CPU didn't die: tell everyone. Can't complain. 
*/ smpboot_unpark_threads(cpu); - cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu); + cpu_notify_nofail(CPU_DOWN_FAILED, cpu); goto out_release; } BUG_ON(cpu_online(cpu)); @@ -311,14 +305,14 @@ static int __ref _cpu_down(unsigned int __cpu_die(cpu); /* CPU is completely dead: tell everyone. Too late to complain. */ - cpu_notify_nofail(CPU_DEAD | mod, hcpu); + cpu_notify_nofail(CPU_DEAD, cpu); check_for_tasks(cpu); out_release: cpu_hotplug_done(); if (!err) - cpu_notify_nofail(CPU_POST_DEAD | mod, hcpu); + cpu_notify_nofail(CPU_POST_DEAD, cpu); return err; } @@ -345,10 +339,8 @@ EXPORT_SYMBOL(cpu_down); /* Requires cpu_add_remove_lock to be held */ static int __cpuinit _cpu_up(u
[patch 24/40] arm64: Convert generic timers to hotplug state machine
Straight forward replacement. Signed-off-by: Thomas Gleixner --- drivers/clocksource/arm_generic.c | 40 +++--- include/linux/cpuhotplug.h|1 2 files changed, 13 insertions(+), 28 deletions(-) Index: linux-2.6/drivers/clocksource/arm_generic.c === --- linux-2.6.orig/drivers/clocksource/arm_generic.c +++ linux-2.6/drivers/clocksource/arm_generic.c @@ -91,8 +91,10 @@ static int arch_timer_set_next_event(uns return 0; } -static void __cpuinit arch_timer_setup(struct clock_event_device *clk) +static int __cpuinit arch_timer_cpu_starting(unsigned int cpu) { + struct clock_event_device *clk = per_cpu_ptr(&arch_timer_evt, cpu); + /* Let's make sure the timer is off before doing anything else */ arch_timer_stop(); @@ -157,34 +159,17 @@ unsigned long long notrace sched_clock(v return arch_counter_get_cntvct() * sched_clock_mult; } -static int __cpuinit arch_timer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) + +static int __cpuinit arch_timer_dying_cpu(unsigned int cpu) { - int cpu = (long)hcpu; struct clock_event_device *clk = per_cpu_ptr(&arch_timer_evt, cpu); - switch(action) { - case CPU_STARTING: - case CPU_STARTING_FROZEN: - arch_timer_setup(clk); - break; - - case CPU_DYING: - case CPU_DYING_FROZEN: - pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n", -clk->irq, cpu); - disable_percpu_irq(clk->irq); - arch_timer_set_mode(CLOCK_EVT_MODE_UNUSED, clk); - break; - } - - return NOTIFY_OK; + pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n", clk->irq, cpu); + disable_percpu_irq(clk->irq); + arch_timer_set_mode(CLOCK_EVT_MODE_UNUSED, clk); + return 0; } -static struct notifier_block __cpuinitdata arch_timer_cpu_nb = { - .notifier_call = arch_timer_cpu_notify, -}; - static const struct of_device_id arch_timer_of_match[] __initconst = { { .compatible = "arm,armv8-timer" }, {}, @@ -223,10 +208,9 @@ int __init arm_generic_timer_init(void) /* Calibrate the delay loop directly */ lpj_fine = DIV_ROUND_CLOSEST(arch_timer_rate, HZ); - /* Immediately configure the timer on the boot CPU */ - arch_timer_setup(this_cpu_ptr(&arch_timer_evt)); - - register_cpu_notifier(&arch_timer_cpu_nb); + /* Register and immediately configure the timer on the boot CPU */ + return cpuhp_setup_state(CPUHP_AP_ARM64_TIMER_STARTING, +arch_timer_starting_cpu, arch_timer_dying_cpu); return 0; } Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 23/40] cpufreq: Convert to hotplug state machine
Straight forward conversion to state machine callbacks w/o fixing the obvious brokeness of the asymetric state invocations. Signed-off-by: Thomas Gleixner --- drivers/cpufreq/cpufreq_stats.c | 55 +--- include/linux/cpuhotplug.h |2 + 2 files changed, 15 insertions(+), 42 deletions(-) Index: linux-2.6/drivers/cpufreq/cpufreq_stats.c === --- linux-2.6.orig/drivers/cpufreq/cpufreq_stats.c +++ linux-2.6/drivers/cpufreq/cpufreq_stats.c @@ -167,7 +167,7 @@ static int freq_table_get_index(struct c /* should be called late in the CPU removal sequence so that the stats * memory is still available in case someone tries to use it. */ -static void cpufreq_stats_free_table(unsigned int cpu) +static int cpufreq_stats_free_table(unsigned int cpu) { struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu); if (stat) { @@ -175,18 +175,20 @@ static void cpufreq_stats_free_table(uns kfree(stat); } per_cpu(cpufreq_stats_table, cpu) = NULL; + return 0; } /* must be called early in the CPU removal sequence (before * cpufreq_remove_dev) so that policy is still valid. */ -static void cpufreq_stats_free_sysfs(unsigned int cpu) +static int cpufreq_stats_free_sysfs(unsigned int cpu) { struct cpufreq_policy *policy = cpufreq_cpu_get(cpu); if (policy && policy->cpu == cpu) sysfs_remove_group(&policy->kobj, &stats_attr_group); if (policy) cpufreq_cpu_put(policy); + return 0; } static int cpufreq_stats_create_table(struct cpufreq_policy *policy, @@ -316,35 +318,6 @@ static int cpufreq_stat_notifier_trans(s return 0; } -static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb, - unsigned long action, - void *hcpu) -{ - unsigned int cpu = (unsigned long)hcpu; - - switch (action) { - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - cpufreq_update_policy(cpu); - break; - case CPU_DOWN_PREPARE: - case CPU_DOWN_PREPARE_FROZEN: - cpufreq_stats_free_sysfs(cpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - cpufreq_stats_free_table(cpu); - break; - } - return NOTIFY_OK; -} - -/* priority=1 so this will get called before cpufreq_remove_dev */ -static struct notifier_block cpufreq_stat_cpu_notifier __refdata = { - .notifier_call = cpufreq_stat_cpu_callback, - .priority = 1, -}; - static struct notifier_block notifier_policy_block = { .notifier_call = cpufreq_stat_notifier_policy }; @@ -364,18 +337,19 @@ static int __init cpufreq_stats_init(voi if (ret) return ret; - register_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) - cpufreq_update_policy(cpu); + /* Install callbacks. Core will call them for each online cpu */ + cpuhp_setup_state(CPUHP_CPUFREQ_DEAD, NULL, cpufreq_stats_free_table); + /* CHECKME: This is pretty broken versus failures in up/down! 
*/ + cpuhp_setup_state(CPUHP_CPUFREQ_ONLINE, cpufreq_update_policy, + cpufreq_stats_free_sysfs); ret = cpufreq_register_notifier(¬ifier_trans_block, CPUFREQ_TRANSITION_NOTIFIER); if (ret) { cpufreq_unregister_notifier(¬ifier_policy_block, CPUFREQ_POLICY_NOTIFIER); - unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) - cpufreq_stats_free_table(cpu); + cpuhp_uninstall_callbacks(cpufreq_stats_cbs, + ARRAY_SIZE(cpufreq_stats_cbs)); return ret; } @@ -389,11 +363,8 @@ static void __exit cpufreq_stats_exit(vo CPUFREQ_POLICY_NOTIFIER); cpufreq_unregister_notifier(¬ifier_trans_block, CPUFREQ_TRANSITION_NOTIFIER); - unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier); - for_each_online_cpu(cpu) { - cpufreq_stats_free_table(cpu); - cpufreq_stats_free_sysfs(cpu); - } + cpuhp_uninstall_callbacks(cpufreq_stats_cbs, + ARRAY_SIZE(cpufreq_stats_cbs)); } MODULE_AUTHOR("Zou Nan hai "); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_PREP, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, +
[patch 27/40] virt: Convert kvm hotplug to state machine
Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |1 + virt/kvm/kvm_main.c| 42 -- 2 files changed, 17 insertions(+), 26 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -25,6 +25,7 @@ enum cpuhp_states { CPUHP_AP_PERF_ARM_STARTING, CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, + CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, Index: linux-2.6/virt/kvm/kvm_main.c === --- linux-2.6.orig/virt/kvm/kvm_main.c +++ linux-2.6/virt/kvm/kvm_main.c @@ -2496,30 +2496,23 @@ static int hardware_enable_all(void) return r; } -static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val, - void *v) +static int kvm_starting_cpu(unsigned int cpu) { - int cpu = (long)v; - - if (!kvm_usage_count) - return NOTIFY_OK; - - val &= ~CPU_TASKS_FROZEN; - switch (val) { - case CPU_DYING: - printk(KERN_INFO "kvm: disabling virtualization on CPU%d\n", - cpu); - hardware_disable(NULL); - break; - case CPU_STARTING: - printk(KERN_INFO "kvm: enabling virtualization on CPU%d\n", - cpu); + if (kvm_usage_count) { + pr_info("kvm: enabling virtualization on CPU%u\n", cpu); hardware_enable(NULL); - break; } - return NOTIFY_OK; + return 0; } +static int kvm_dying_cpu(unsigned int cpu) +{ + if (kvm_usage_count) { + pr_info("kvm: disabling virtualization on CPU%u\n", cpu); + hardware_disable(NULL); + } + return 0; +} asmlinkage void kvm_spurious_fault(void) { @@ -2725,10 +2718,6 @@ int kvm_io_bus_unregister_dev(struct kvm return r; } -static struct notifier_block kvm_cpu_notifier = { - .notifier_call = kvm_cpu_hotplug, -}; - static int vm_stat_get(void *_offset, u64 *val) { unsigned offset = (long)_offset; @@ -2870,7 +2859,8 @@ int kvm_init(void *opaque, unsigned vcpu goto out_free_1; } - r = register_cpu_notifier(&kvm_cpu_notifier); + r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, kvm_starting_cpu, + kvm_dying_cpu); if (r) goto out_free_2; register_reboot_notifier(&kvm_reboot_notifier); @@ -2920,7 +2910,7 @@ out_free: kmem_cache_destroy(kvm_vcpu_cache); out_free_3: unregister_reboot_notifier(&kvm_reboot_notifier); - unregister_cpu_notifier(&kvm_cpu_notifier); + cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING); out_free_2: out_free_1: kvm_arch_hardware_unsetup(); @@ -2941,7 +2931,7 @@ void kvm_exit(void) kvm_async_pf_deinit(); unregister_syscore_ops(&kvm_syscore_ops); unregister_reboot_notifier(&kvm_reboot_notifier); - unregister_cpu_notifier(&kvm_cpu_notifier); + cpuhp_remove_state_nocalls(CPUHP_AP_KVM_STARTING); on_each_cpu(hardware_disable_nolock, NULL, 1); kvm_arch_hardware_unsetup(); kvm_arch_exit(); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 30/40] x86: tboot: Convert to hotplug state machine
Signed-off-by: Thomas Gleixner --- arch/x86/kernel/tboot.c| 23 +++ include/linux/cpuhotplug.h |1 + 2 files changed, 8 insertions(+), 16 deletions(-) Index: linux-2.6/arch/x86/kernel/tboot.c === --- linux-2.6.orig/arch/x86/kernel/tboot.c +++ linux-2.6/arch/x86/kernel/tboot.c @@ -319,25 +319,16 @@ static int tboot_wait_for_aps(int num_ap return !(atomic_read((atomic_t *)&tboot->num_in_wfs) == num_aps); } -static int __cpuinit tboot_cpu_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +static int __cpuinit tboot_dying_cpu(unsigned int cpu) { - switch (action) { - case CPU_DYING: - atomic_inc(&ap_wfs_count); - if (num_online_cpus() == 1) - if (tboot_wait_for_aps(atomic_read(&ap_wfs_count))) - return NOTIFY_BAD; - break; + atomic_inc(&ap_wfs_count); + if (num_online_cpus() == 1) { + if (tboot_wait_for_aps(atomic_read(&ap_wfs_count))) + return -EBUSY; } - return NOTIFY_OK; + return 0; } -static struct notifier_block tboot_cpu_notifier __cpuinitdata = -{ - .notifier_call = tboot_cpu_callback, -}; - static __init int tboot_late_init(void) { if (!tboot_enabled()) @@ -346,7 +337,7 @@ static __init int tboot_late_init(void) tboot_create_trampoline(); atomic_set(&ap_wfs_count, 0); - register_hotcpu_notifier(&tboot_cpu_notifier); + cpuhp_setup_state(CPUHP_AP_X86_TBOOT_DYING, NULL, tboot_dying_cpu); acpi_os_set_prepare_sleep(&tboot_sleep); return 0; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -27,6 +27,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 38/40] smp: Convert core to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |5 kernel/cpu.c |4 +++ kernel/smp.c | 50 - 3 files changed, 27 insertions(+), 32 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -17,6 +17,7 @@ enum cpuhp_states { CPUHP_TIMERS_PREPARE, CPUHP_PROFILE_PREPARE, CPUHP_X2APIC_PREPARE, + CPUHP_SMPCFD_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -199,4 +200,8 @@ int profile_online_cpu(unsigned int cpu) #define profile_online_cpu NULL #endif +/* SMP core functions */ +int smpcfd_prepare_cpu(unsigned int cpu); +int smpcfd_dead_cpu(unsigned int cpu); + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -764,6 +764,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = profile_prepare_cpu, .teardown = profile_dead_cpu, }, + [CPUHP_SMPCFD_PREPARE] = { + .startup = smpcfd_prepare_cpu, + .teardown = smpcfd_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/smp.c === --- linux-2.6.orig/kernel/smp.c +++ linux-2.6/kernel/smp.c @@ -45,45 +45,32 @@ struct call_single_queue { static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_queue, call_single_queue); -static int -hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu) +int __cpuinit smpcfd_prepare_cpu(unsigned int cpu) { - long cpu = (long)hcpu; struct call_function_data *cfd = &per_cpu(cfd_data, cpu); - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - if (!zalloc_cpumask_var_node(&cfd->cpumask, GFP_KERNEL, - cpu_to_node(cpu))) - return notifier_from_errno(-ENOMEM); - if (!zalloc_cpumask_var_node(&cfd->cpumask_ipi, GFP_KERNEL, - cpu_to_node(cpu))) - return notifier_from_errno(-ENOMEM); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - - case CPU_DEAD: - case CPU_DEAD_FROZEN: + if (!zalloc_cpumask_var_node(&cfd->cpumask, GFP_KERNEL, +cpu_to_node(cpu))) + return -ENOMEM; + if (!zalloc_cpumask_var_node(&cfd->cpumask_ipi, GFP_KERNEL, +cpu_to_node(cpu))) { free_cpumask_var(cfd->cpumask); - free_cpumask_var(cfd->cpumask_ipi); - break; -#endif - }; - - return NOTIFY_OK; + return -ENOMEM; + } + return; } -static struct notifier_block __cpuinitdata hotplug_cfd_notifier = { - .notifier_call = hotplug_cfd, -}; +int __cpuinit smpcfd_dead_cpu(unsigned int cpu) +{ + struct call_function_data *cfd = &per_cpu(cfd_data, cpu); + + free_cpumask_var(cfd->cpumask); + free_cpumask_var(cfd->cpumask_ipi); + return 0; +} void __init call_function_init(void) { - void *cpu = (void *)(long)smp_processor_id(); int i; for_each_possible_cpu(i) { @@ -93,8 +80,7 @@ void __init call_function_init(void) INIT_LIST_HEAD(&q->list); } - hotplug_cfd(&hotplug_cfd_notifier, CPU_UP_PREPARE, cpu); - register_cpu_notifier(&hotplug_cfd_notifier); + smpcfd_prepare_cpu(smp_processor_id()); } /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 01/40] smpboot: Allow selfparking per cpu threads
The stop machine threads are still killed when a cpu goes offline. The reason is that the thread is used to bring the cpu down, so it can't be parked along with the other per cpu threads. Allow a per cpu thread to be excluded from automatic parking, so it can park itself once it's done Add a create callback function as well. Signed-off-by: Thomas Gleixner --- include/linux/smpboot.h |5 + kernel/smpboot.c|5 +++-- 2 files changed, 8 insertions(+), 2 deletions(-) Index: linux-2.6/include/linux/smpboot.h === --- linux-2.6.orig/include/linux/smpboot.h +++ linux-2.6/include/linux/smpboot.h @@ -14,6 +14,8 @@ struct smpboot_thread_data; * @thread_should_run: Check whether the thread should run or not. Called with * preemption disabled. * @thread_fn: The associated thread function + * @create:Optional setup function, called when the thread gets + * created (Not called from the thread context) * @setup: Optional setup function, called when the thread gets * operational the first time * @cleanup: Optional cleanup function, called when the thread @@ -22,6 +24,7 @@ struct smpboot_thread_data; * parked (cpu offline) * @unpark:Optional unpark function, called when the thread is * unparked (cpu online) + * @selfparking: Thread is not parked by the park function. * @thread_comm: The base name of the thread */ struct smp_hotplug_thread { @@ -29,10 +32,12 @@ struct smp_hotplug_thread { struct list_headlist; int (*thread_should_run)(unsigned int cpu); void(*thread_fn)(unsigned int cpu); + void(*create)(unsigned int cpu); void(*setup)(unsigned int cpu); void(*cleanup)(unsigned int cpu, bool online); void(*park)(unsigned int cpu); void(*unpark)(unsigned int cpu); + boolselfparking; const char *thread_comm; }; Index: linux-2.6/kernel/smpboot.c === --- linux-2.6.orig/kernel/smpboot.c +++ linux-2.6/kernel/smpboot.c @@ -183,9 +183,10 @@ __smpboot_create_thread(struct smp_hotpl kfree(td); return PTR_ERR(tsk); } - get_task_struct(tsk); *per_cpu_ptr(ht->store, cpu) = tsk; + if (ht->create) + ht->create(cpu); return 0; } @@ -225,7 +226,7 @@ static void smpboot_park_thread(struct s { struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu); - if (tsk) + if (tsk && !ht->selfparking) kthread_park(tsk); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
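A usage sketch of the two new fields, with an invented example client that is not taken from the series: the create() hook runs right after kthread creation (not from thread context), and selfparking tells smpboot_park_thread() to leave the thread alone because it parks itself once its teardown work is done.

#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/percpu.h>
#include <linux/smpboot.h>

static DEFINE_PER_CPU(struct task_struct *, example_task);

static int example_should_run(unsigned int cpu)
{
	return 0;			/* nothing to do in this sketch */
}

static void example_thread_fn(unsigned int cpu)
{
	/* A real self-parking user does its per-cpu teardown work here
	 * and eventually parks itself via kthread_parkme(). */
}

static void example_create(unsigned int cpu)
{
	/* Invoked after the kthread is created, not from thread context. */
}

static struct smp_hotplug_thread example_threads = {
	.store			= &example_task,
	.thread_should_run	= example_should_run,
	.thread_fn		= example_thread_fn,
	.create			= example_create,
	.selfparking		= true,
	.thread_comm		= "example/%u",
};

static int __init example_smpboot_init(void)
{
	return smpboot_register_percpu_thread(&example_threads);
}
early_initcall(example_smpboot_init);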
[patch 14/40] x86: perf: Convert the core to the hotplug state machine
Replace the perf_notifier() install mechanism, which invokes magically the callback on the current cpu. Convert the hardware specific callbacks which are invoked from the x86 perf core to return proper error codes instead of totally pointless NOTIFY_BAD return values. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event.c | 78 ++--- arch/x86/kernel/cpu/perf_event_amd.c |6 +- arch/x86/kernel/cpu/perf_event_intel.c |6 +- include/linux/cpuhotplug.h |3 + 4 files changed, 52 insertions(+), 41 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event.c @@ -1252,47 +1252,45 @@ perf_event_nmi_handler(unsigned int cmd, struct event_constraint emptyconstraint; struct event_constraint unconstrained; -static int __cpuinit -x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) +static int __cpuinit x86_pmu_prepare_cpu(unsigned int cpu) { - unsigned int cpu = (long)hcpu; struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); - int ret = NOTIFY_OK; - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - cpuc->kfree_on_online = NULL; - if (x86_pmu.cpu_prepare) - ret = x86_pmu.cpu_prepare(cpu); - break; - - case CPU_STARTING: - if (x86_pmu.attr_rdpmc) - set_in_cr4(X86_CR4_PCE); - if (x86_pmu.cpu_starting) - x86_pmu.cpu_starting(cpu); - break; + cpuc->kfree_on_online = NULL; + if (x86_pmu.cpu_prepare) + return x86_pmu.cpu_prepare(cpu); + return 0; +} - case CPU_ONLINE: - kfree(cpuc->kfree_on_online); - break; +static int __cpuinit x86_pmu_dead_cpu(unsigned int cpu) +{ + if (x86_pmu.cpu_dead) + x86_pmu.cpu_dead(cpu); + return 0; +} - case CPU_DYING: - if (x86_pmu.cpu_dying) - x86_pmu.cpu_dying(cpu); - break; +static int __cpuinit x86_pmu_online_cpu(unsigned int cpu) +{ + struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); - case CPU_UP_CANCELED: - case CPU_DEAD: - if (x86_pmu.cpu_dead) - x86_pmu.cpu_dead(cpu); - break; + kfree(cpuc->kfree_on_online); + return 0; +} - default: - break; - } +static int __cpuinit x86_pmu_starting_cpu(unsigned int cpu) +{ + if (x86_pmu.attr_rdpmc) + set_in_cr4(X86_CR4_PCE); + if (x86_pmu.cpu_starting) + x86_pmu.cpu_starting(cpu); + return 0; +} - return ret; +static int __cpuinit x86_pmu_dying_cpu(unsigned int cpu) +{ + if (x86_pmu.cpu_dying) + x86_pmu.cpu_dying(cpu); + return 0; } static void __init pmu_check_apic(void) @@ -1485,8 +1483,18 @@ static int __init init_hw_perf_events(vo pr_info("... event mask: %016Lx\n", x86_pmu.intel_ctrl); perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW); - perf_cpu_notifier(x86_pmu_notifier); - + /* +* Install callbacks. Core will call them for each online +* cpu. 
+* +* FIXME: This should check the return value, but the original +* code did not do that either +*/ + cpuhp_setup_state(CPUHP_PERF_X86_PREPARE, x86_pmu_prepare_cpu, + x86_pmu_dead_cpu); + cpuhp_setup_state(CPUHP_AP_PERF_X86_STARTING, x86_pmu_starting_cpu, + x86_pmu_dying_cpu); + cpuhp_setup_state(CPUHP_PERF_X86_ONLINE, x86_pmu_online_cpu, NULL); return 0; } early_initcall(init_hw_perf_events); Index: linux-2.6/arch/x86/kernel/cpu/perf_event_amd.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_amd.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_amd.c @@ -349,13 +349,13 @@ static int amd_pmu_cpu_prepare(int cpu) WARN_ON_ONCE(cpuc->amd_nb); if (boot_cpu_data.x86_max_cores < 2) - return NOTIFY_OK; + return 0; cpuc->amd_nb = amd_alloc_nb(cpu); if (!cpuc->amd_nb) - return NOTIFY_BAD; + return -ENOMEM; - return NOTIFY_OK; + return 0; } static void amd_pmu_cpu_starting(int cpu) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c @@ -1662,13 +1662,13 @@ static int intel_pmu_cpu_prepare(int cpu
[patch 36/40] profile: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger --- include/linux/cpuhotplug.h | 12 + kernel/cpu.c |8 +++ kernel/profile.c | 92 + 3 files changed, 63 insertions(+), 49 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -15,6 +15,7 @@ enum cpuhp_states { CPUHP_RCUTREE_PREPARE, CPUHP_HRTIMERS_PREPARE, CPUHP_TIMERS_PREPARE, + CPUHP_PROFILE_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -46,6 +47,7 @@ enum cpuhp_states { CPUHP_CPUFREQ_ONLINE, CPUHP_RCUTREE_ONLINE, CPUHP_NOTIFY_ONLINE, + CPUHP_PROFILE_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, CPUHP_PERF_X86_UNCORE_ONLINE, CPUHP_PERF_X86_ONLINE, @@ -186,4 +188,14 @@ int timers_dead_cpu(unsigned int cpu); #define timers_dead_cpuNULL #endif +#if defined(CONFIG_PROFILING) && defined(CONFIG_HOTPLUG_CPU) +int profile_prepare_cpu(unsigned int cpu); +int profile_dead_cpu(unsigned int cpu); +int profile_online_cpu(unsigned int cpu); +#else +#define profile_prepare_cpuNULL +#define profile_dead_cpu NULL +#define profile_online_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -760,6 +760,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = timers_prepare_cpu, .teardown = timers_dead_cpu, }, + [CPUHP_PROFILE_PREPARE] = { + .startup = profile_prepare_cpu, + .teardown = profile_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -804,6 +808,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = notify_online, .teardown = NULL, }, + [CPUHP_PROFILE_ONLINE] = { + .startup = profile_online_cpu, + .teardown = NULL, + }, [CPUHP_NOTIFY_DOWN_PREPARE] = { .startup = NULL, .teardown = notify_down_prepare, Index: linux-2.6/kernel/profile.c === --- linux-2.6.orig/kernel/profile.c +++ linux-2.6/kernel/profile.c @@ -353,68 +353,63 @@ out: put_cpu(); } -static int __cpuinit profile_cpu_callback(struct notifier_block *info, - unsigned long action, void *__cpu) +int __cpuinit profile_dead_cpu(unsigned int cpu) { - int node, cpu = (unsigned long)__cpu; struct page *page; + int i; - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - node = cpu_to_mem(cpu); - per_cpu(cpu_profile_flip, cpu) = 0; - if (!per_cpu(cpu_profile_hits, cpu)[1]) { - page = alloc_pages_exact_node(node, - GFP_KERNEL | __GFP_ZERO, - 0); - if (!page) - return notifier_from_errno(-ENOMEM); - per_cpu(cpu_profile_hits, cpu)[1] = page_address(page); - } - if (!per_cpu(cpu_profile_hits, cpu)[0]) { - page = alloc_pages_exact_node(node, - GFP_KERNEL | __GFP_ZERO, - 0); - if (!page) - goto out_free; - per_cpu(cpu_profile_hits, cpu)[0] = page_address(page); - } - break; -out_free: - page = virt_to_page(per_cpu(cpu_profile_hits, cpu)[1]); - per_cpu(cpu_profile_hits, cpu)[1] = NULL; - __free_page(page); - return notifier_from_errno(-ENOMEM); - case CPU_ONLINE: - case CPU_ONLINE_FROZEN: - if (prof_cpu_mask != NULL) - cpumask_set_cpu(cpu, prof_cpu_mask); - break; - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - case CPU_DEAD: - case CPU_DEAD_FROZEN: - if (prof_cpu_mask != NULL) - cpumask_clear_cpu(cpu, prof_cpu_mask); - if (per_cpu(cpu_profile_hits, cpu)[0]) { - page = virt_to_page(per_cpu(cpu_profile_hits, cpu)[0]); - per_cpu(cpu_profile_hits, cpu)[0] = NULL; + if (prof_cpu_mask != NULL) + cpumask_clear_cpu(cpu, prof_cpu_mask); + + for (i = 0; i < 2; i++) { + if (per_cpu(cpu_profile_hits, cpu)[i]) { +
[patch 37/40] x86: x2apic: Convert to cpu hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- arch/x86/kernel/apic/x2apic_cluster.c | 80 -- include/linux/cpuhotplug.h|1 2 files changed, 31 insertions(+), 50 deletions(-) Index: linux-2.6/arch/x86/kernel/apic/x2apic_cluster.c === --- linux-2.6.orig/arch/x86/kernel/apic/x2apic_cluster.c +++ linux-2.6/arch/x86/kernel/apic/x2apic_cluster.c @@ -145,68 +145,48 @@ static void init_x2apic_ldr(void) } } - /* - * At CPU state changes, update the x2apic cluster sibling info. - */ -static int __cpuinit -update_clusterinfo(struct notifier_block *nfb, unsigned long action, void *hcpu) +/* + * At CPU state changes, update the x2apic cluster sibling info. + */ +int __cpuinit x2apic_prepare_cpu(unsigned int cpu) { - unsigned int this_cpu = (unsigned long)hcpu; - unsigned int cpu; - int err = 0; - - switch (action) { - case CPU_UP_PREPARE: - if (!zalloc_cpumask_var(&per_cpu(cpus_in_cluster, this_cpu), - GFP_KERNEL)) { - err = -ENOMEM; - } else if (!zalloc_cpumask_var(&per_cpu(ipi_mask, this_cpu), - GFP_KERNEL)) { - free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); - err = -ENOMEM; - } - break; - case CPU_UP_CANCELED: - case CPU_UP_CANCELED_FROZEN: - case CPU_DEAD: - for_each_online_cpu(cpu) { - if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu)) - continue; - __cpu_clear(this_cpu, per_cpu(cpus_in_cluster, cpu)); - __cpu_clear(cpu, per_cpu(cpus_in_cluster, this_cpu)); - } - free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); - free_cpumask_var(per_cpu(ipi_mask, this_cpu)); - break; + if (!zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL)) + return -ENOMEM; + + if (!zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL)) { + free_cpumask_var(per_cpu(cpus_in_cluster, cpu)); + return -ENOMEM; } - return notifier_from_errno(err); + return 0; } -static struct notifier_block __refdata x2apic_cpu_notifier = { - .notifier_call = update_clusterinfo, -}; - -static int x2apic_init_cpu_notifier(void) +int __cpuinit x2apic_dead_cpu(unsigned int this_cpu) { - int cpu = smp_processor_id(); - - zalloc_cpumask_var(&per_cpu(cpus_in_cluster, cpu), GFP_KERNEL); - zalloc_cpumask_var(&per_cpu(ipi_mask, cpu), GFP_KERNEL); + int cpu; - BUG_ON(!per_cpu(cpus_in_cluster, cpu) || !per_cpu(ipi_mask, cpu)); - - __cpu_set(cpu, per_cpu(cpus_in_cluster, cpu)); - register_hotcpu_notifier(&x2apic_cpu_notifier); - return 1; + for_each_online_cpu(cpu) { + if (x2apic_cluster(this_cpu) != x2apic_cluster(cpu)) + continue; + __cpu_clear(this_cpu, per_cpu(cpus_in_cluster, cpu)); + __cpu_clear(cpu, per_cpu(cpus_in_cluster, this_cpu)); + } + free_cpumask_var(per_cpu(cpus_in_cluster, this_cpu)); + free_cpumask_var(per_cpu(ipi_mask, this_cpu)); + return 0; } static int x2apic_cluster_probe(void) { - if (x2apic_mode) - return x2apic_init_cpu_notifier(); - else + int cpu = smp_processor_id(); + + if (!x2apic_mode) return 0; + + __cpu_set(cpu, per_cpu(cpus_in_cluster, cpu)); + cpuhp_setup_state(CPUHP_X2APIC_PREPARE, x2apic_prepare_cpu, + x2apic_dead_cpu); + return 1; } static const struct cpumask *x2apic_cluster_target_cpus(void) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -16,6 +16,7 @@ enum cpuhp_states { CPUHP_HRTIMERS_PREPARE, CPUHP_TIMERS_PREPARE, CPUHP_PROFILE_PREPARE, + CPUHP_X2APIC_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More 
majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
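Two registration styles show up in the series. Core infrastructure gets a static entry in the cpuhp_bp_states[]/cpuhp_ap_states[] tables in kernel/cpu.c (as the profile patch above does), while code like the x2apic cluster driver registers at runtime with cpuhp_setup_state(). A minimal usage sketch of the runtime form, following the three-argument (state, startup, teardown) calls visible in these patches; "foo" and CPUHP_FOO_PREPARE are placeholders, and a real user would first add the enum slot to cpuhotplug.h just as CPUHP_X2APIC_PREPARE is added here:

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>

        static int foo_prepare_cpu(unsigned int cpu)
        {
                /* allocate whatever this CPU will need, may fail with -ENOMEM */
                return 0;
        }

        static int foo_dead_cpu(unsigned int cpu)
        {
                /* release it again once the CPU is gone */
                return 0;
        }

        static int __init foo_hotplug_init(void)
        {
                cpuhp_setup_state(CPUHP_FOO_PREPARE, foo_prepare_cpu, foo_dead_cpu);
                return 0;
        }
        device_initcall(foo_hotplug_init);

Note that x2apic_cluster_probe() above still seeds the boot CPU's own cpus_in_cluster mask by hand before registering the callbacks.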
[patch 35/40] timers: Convert to hotplug state machine
From: Richard Weinberger Signed-off-by: Richard Weinberger Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |4 kernel/cpu.c |4 kernel/timer.c | 43 +-- 3 files changed, 13 insertions(+), 38 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -14,6 +14,7 @@ enum cpuhp_states { CPUHP_WORKQUEUE_PREP, CPUHP_RCUTREE_PREPARE, CPUHP_HRTIMERS_PREPARE, + CPUHP_TIMERS_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CLOCKEVENTS_DEAD, @@ -176,10 +177,13 @@ int clockevents_dead_cpu(unsigned int cp #endif int hrtimers_prepare_cpu(unsigned int cpu); +int timers_prepare_cpu(unsigned int cpu); #ifdef CONFIG_HOTPLUG_CPU int hrtimers_dead_cpu(unsigned int cpu); +int timers_dead_cpu(unsigned int cpu); #else #define hrtimers_dead_cpu NULL +#define timers_dead_cpuNULL #endif #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -756,6 +756,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = hrtimers_prepare_cpu, .teardown = hrtimers_dead_cpu, }, + [CPUHP_TIMERS_PREPARE] = { + .startup = timers_prepare_cpu, + .teardown = timers_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, Index: linux-2.6/kernel/timer.c === --- linux-2.6.orig/kernel/timer.c +++ linux-2.6/kernel/timer.c @@ -1642,7 +1642,7 @@ SYSCALL_DEFINE1(sysinfo, struct sysinfo return 0; } -static int __cpuinit init_timers_cpu(int cpu) +int __cpuinit timers_prepare_cpu(unsigned int cpu) { int j; struct tvec_base *base; @@ -1714,7 +1714,7 @@ static void migrate_timer_list(struct tv } } -static void __cpuinit migrate_timers(int cpu) +int __cpuinit timers_dead_cpu(unsigned int cpu) { struct tvec_base *old_base; struct tvec_base *new_base; @@ -1744,52 +1744,19 @@ static void __cpuinit migrate_timers(int spin_unlock(&old_base->lock); spin_unlock_irq(&new_base->lock); put_cpu_var(tvec_bases); -} -#endif /* CONFIG_HOTPLUG_CPU */ - -static int __cpuinit timer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) -{ - long cpu = (long)hcpu; - int err; - switch(action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - err = init_timers_cpu(cpu); - if (err < 0) - return notifier_from_errno(err); - break; -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DEAD: - case CPU_DEAD_FROZEN: - migrate_timers(cpu); - break; -#endif - default: - break; - } - return NOTIFY_OK; + return 0; } - -static struct notifier_block __cpuinitdata timers_nb = { - .notifier_call = timer_cpu_notify, -}; - +#endif /* CONFIG_HOTPLUG_CPU */ void __init init_timers(void) { - int err; - /* ensure there are enough low bits for flags in timer->base pointer */ BUILD_BUG_ON(__alignof__(struct tvec_base) & TIMER_FLAG_MASK); - err = timer_cpu_notify(&timers_nb, (unsigned long)CPU_UP_PREPARE, - (void *)(long)smp_processor_id()); init_timer_stats(); + BUG_ON(timers_prepare_cpu(smp_processor_id())); - BUG_ON(err != NOTIFY_OK); - register_cpu_notifier(&timers_nb); open_softirq(TIMER_SOFTIRQ, run_timer_softirq); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
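One detail worth noting in the init_timers() hunk: the old code faked a CPU_UP_PREPARE notification for the boot CPU, and the new code simply calls timers_prepare_cpu(smp_processor_id()) directly, because init_timers() runs far too early in start_kernel() for any hotplug machinery to be involved. A sketch of that pattern for an early-init user, with placeholder names:

        #include <linux/bug.h>
        #include <linux/init.h>
        #include <linux/smp.h>

        static int foo_prepare_cpu(unsigned int cpu)
        {
                /* stands in for timers_prepare_cpu(): set up per-CPU state */
                return 0;
        }

        void __init foo_early_init(void)
        {
                /*
                 * The boot CPU never went through the PREPARE phase, so run
                 * its callback by hand; a failure this early in boot is fatal.
                 */
                BUG_ON(foo_prepare_cpu(smp_processor_id()));
        }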
[patch 34/40] cpuhotplug: Remove CPU_DYING notifier
All users gone. Signed-off-by: Thomas Gleixner --- include/linux/cpu.h|6 -- include/linux/cpuhotplug.h |1 - kernel/cpu.c | 11 --- 3 files changed, 18 deletions(-) Index: linux-2.6/include/linux/cpu.h === --- linux-2.6.orig/include/linux/cpu.h +++ linux-2.6/include/linux/cpu.h @@ -60,11 +60,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_PREPARE 0x0005 /* CPU (unsigned)v going down */ #define CPU_DOWN_FAILED0x0006 /* CPU (unsigned)v NOT going down */ #define CPU_DEAD 0x0007 /* CPU (unsigned)v dead */ -#define CPU_DYING 0x0008 /* CPU (unsigned)v not running any task, - * not handling interrupts, soon dead. - * Called on the dying cpu, interrupts - * are already disabled. Must not - * sleep, must not fail */ #define CPU_POST_DEAD 0x0009 /* CPU (unsigned)v dead, cpu_hotplug * lock is dropped */ @@ -79,7 +74,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_PREPARE_FROZEN(CPU_DOWN_PREPARE | CPU_TASKS_FROZEN) #define CPU_DOWN_FAILED_FROZEN (CPU_DOWN_FAILED | CPU_TASKS_FROZEN) #define CPU_DEAD_FROZEN(CPU_DEAD | CPU_TASKS_FROZEN) -#define CPU_DYING_FROZEN (CPU_DYING | CPU_TASKS_FROZEN) #ifdef CONFIG_SMP extern bool cpuhp_tasks_frozen; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -29,7 +29,6 @@ enum cpuhp_states { CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, - CPUHP_AP_NOTIFY_DYING, CPUHP_AP_CLOCKEVENTS_DYING, CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -303,12 +303,6 @@ static int notify_down_prepare(unsigned return err; } -static int notify_dying(unsigned int cpu) -{ - cpu_notify(CPU_DYING, cpu); - return 0; -} - /* Take this CPU down. */ static int __ref take_cpu_down(void *_param) { @@ -366,7 +360,6 @@ static int notify_dead(unsigned int cpu) #define notify_down_prepareNULL #define takedown_cpu NULL #define notify_deadNULL -#define notify_dying NULL #endif #ifdef CONFIG_HOTPLUG_CPU @@ -825,10 +818,6 @@ static struct cpuhp_step cpuhp_ap_states .startup = sched_starting_cpu, .teardown = NULL, }, - [CPUHP_AP_NOTIFY_DYING] = { - .startup = NULL, - .teardown = notify_dying, - }, [CPUHP_AP_CLOCKEVENTS_DYING] = { .startup = NULL, .teardown = clockevents_dying_cpu, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 33/40] hrtimer: Convert to hotplug state machine
Split out the clockevents callbacks instead of piggypacking them on hrtimers. This gets rid of a POST_DEAD user. See commit 54e88fad. We just move the callback state to the proper place in the state machine. Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 18 + kernel/cpu.c | 12 +++ kernel/hrtimer.c | 47 - kernel/time/clockevents.c | 13 4 files changed, 48 insertions(+), 42 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -13,8 +13,10 @@ enum cpuhp_states { CPUHP_SCHED_MIGRATE_PREP, CPUHP_WORKQUEUE_PREP, CPUHP_RCUTREE_PREPARE, + CPUHP_HRTIMERS_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, + CPUHP_CLOCKEVENTS_DEAD, CPUHP_CPUFREQ_DEAD, CPUHP_SCHED_DEAD, CPUHP_BRINGUP_CPU, @@ -28,6 +30,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_CLOCKEVENTS_DYING, CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, @@ -165,4 +168,19 @@ int rcutree_dying_cpu(unsigned int cpu); #define rcutree_dying_cpu NULL #endif +#ifdef CONFIG_GENERIC_CLOCKEVENTS +int clockevents_dying_cpu(unsigned int cpu); +int clockevents_dead_cpu(unsigned int cpu); +#else +#define clockevents_dying_cpu NULL +#define clockevents_dead_cpu NULL +#endif + +int hrtimers_prepare_cpu(unsigned int cpu); +#ifdef CONFIG_HOTPLUG_CPU +int hrtimers_dead_cpu(unsigned int cpu); +#else +#define hrtimers_dead_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -759,6 +759,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = rcutree_prepare_cpu, .teardown = rcutree_dead_cpu, }, + [CPUHP_HRTIMERS_PREPARE] = { + .startup = hrtimers_prepare_cpu, + .teardown = hrtimers_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -767,6 +771,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = NULL, .teardown = notify_dead, }, + [CPUHP_CLOCKEVENTS_DEAD] = { + .startup = NULL, + .teardown = clockevents_dead_cpu, + }, [CPUHP_BRINGUP_CPU] = { .startup = bringup_cpu, .teardown = NULL, @@ -821,6 +829,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_CLOCKEVENTS_DYING] = { + .startup = NULL, + .teardown = clockevents_dying_cpu, + }, [CPUHP_AP_RCUTREE_DYING] = { .startup = NULL, .teardown = rcutree_dying_cpu, Index: linux-2.6/kernel/hrtimer.c === --- linux-2.6.orig/kernel/hrtimer.c +++ linux-2.6/kernel/hrtimer.c @@ -1635,7 +1635,7 @@ SYSCALL_DEFINE2(nanosleep, struct timesp /* * Functions related to boot-time initialization: */ -static void __cpuinit init_hrtimers_cpu(int cpu) +int __cpuinit hrtimers_prepare_cpu(unsigned int cpu) { struct hrtimer_cpu_base *cpu_base = &per_cpu(hrtimer_bases, cpu); int i; @@ -1648,6 +1648,7 @@ static void __cpuinit init_hrtimers_cpu( } hrtimer_init_hres(cpu_base); + return 0; } #ifdef CONFIG_HOTPLUG_CPU @@ -1685,7 +1686,7 @@ static void migrate_hrtimer_list(struct } } -static void migrate_hrtimers(int scpu) +int __cpuinit hrtimers_dead_cpu(unsigned int scpu) { struct hrtimer_cpu_base *old_base, *new_base; int i; @@ -1714,52 +1715,14 @@ static void migrate_hrtimers(int scpu) /* Check, if we got expired work to do */ __hrtimer_peek_ahead_timers(); local_irq_enable(); + return 0; } #endif /* CONFIG_HOTPLUG_CPU */ -static int __cpuinit hrtimer_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) -{ - int scpu = (long)hcpu; - - switch (action) { 
- - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - init_hrtimers_cpu(scpu); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_DYING: - case CPU_DYING_FROZEN: - clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DYING, &scpu); - break; - case CPU_DEAD: - case CPU_DEAD_FROZEN: - { - clockevents_notify(CLOCK_EVT_NOTIFY_CPU_DEA
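The point of this patch, per the changelog, is that the clockevents work stops piggybacking on the hrtimer notifier and gets its own CPUHP_AP_CLOCKEVENTS_DYING and CPUHP_CLOCKEVENTS_DEAD slots. The sketch below only illustrates the two execution contexts as I read the series, reusing the CPU_DYING contract quoted in the cpu.h hunk of patch 34/40; "foo" stands in for any user, it is not the clockevents code (whose kernel/time/clockevents.c hunk is cut off above):

        static int foo_dying_cpu(unsigned int cpu)
        {
                /*
                 * AP (CPUHP_AP_*_DYING) step: runs on the CPU that is going
                 * down, with interrupts already disabled; must not sleep and
                 * must not fail.
                 */
                return 0;
        }

        static int foo_dead_cpu(unsigned int cpu)
        {
                /*
                 * CPUHP_*_DEAD step: runs after the CPU is gone, on a
                 * controlling CPU, so blocking cleanup (freeing, flushing)
                 * belongs here.
                 */
                return 0;
        }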
[patch 32/40] rcu: Convert rcutree to hotplug state machine
Do we really need so many states here ? Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h | 18 kernel/cpu.c | 12 + kernel/rcutree.c | 95 - 3 files changed, 73 insertions(+), 52 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -12,6 +12,7 @@ enum cpuhp_states { CPUHP_PERF_PREPARE, CPUHP_SCHED_MIGRATE_PREP, CPUHP_WORKQUEUE_PREP, + CPUHP_RCUTREE_PREPARE, CPUHP_NOTIFY_PREPARE, CPUHP_NOTIFY_DEAD, CPUHP_CPUFREQ_DEAD, @@ -27,6 +28,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_RCUTREE_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_NOHZ_DYING, @@ -39,6 +41,7 @@ enum cpuhp_states { CPUHP_SCHED_MIGRATE_ONLINE, CPUHP_WORKQUEUE_ONLINE, CPUHP_CPUFREQ_ONLINE, + CPUHP_RCUTREE_ONLINE, CPUHP_NOTIFY_ONLINE, CPUHP_NOTIFY_DOWN_PREPARE, CPUHP_PERF_X86_UNCORE_ONLINE, @@ -147,4 +150,19 @@ int workqueue_prepare_cpu(unsigned int c int workqueue_online_cpu(unsigned int cpu); int workqueue_offline_cpu(unsigned int cpu); +/* RCUtree hotplug events */ +#if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU) +int rcutree_prepare_cpu(unsigned int cpu); +int rcutree_online_cpu(unsigned int cpu); +int rcutree_offline_cpu(unsigned int cpu); +int rcutree_dead_cpu(unsigned int cpu); +int rcutree_dying_cpu(unsigned int cpu); +#else +#define rcutree_prepare_cpuNULL +#define rcutree_online_cpu NULL +#define rcutree_offline_cpuNULL +#define rcutree_dead_cpu NULL +#define rcutree_dying_cpu NULL +#endif + #endif Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -755,6 +755,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = workqueue_prepare_cpu, .teardown = NULL, }, + [CPUHP_RCUTREE_PREPARE] = { + .startup = rcutree_prepare_cpu, + .teardown = rcutree_dead_cpu, + }, [CPUHP_NOTIFY_PREPARE] = { .startup = notify_prepare, .teardown = NULL, @@ -787,6 +791,10 @@ static struct cpuhp_step cpuhp_bp_states .startup = workqueue_online_cpu, .teardown = workqueue_offline_cpu, }, + [CPUHP_RCUTREE_ONLINE] = { + .startup = rcutree_online_cpu, + .teardown = rcutree_offline_cpu, + }, [CPUHP_NOTIFY_ONLINE] = { .startup = notify_online, .teardown = NULL, @@ -813,6 +821,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_RCUTREE_DYING] = { + .startup = NULL, + .teardown = rcutree_dying_cpu, + }, [CPUHP_AP_SCHED_NOHZ_DYING] = { .startup = NULL, .teardown = nohz_balance_exit_idle, Index: linux-2.6/kernel/rcutree.c === --- linux-2.6.orig/kernel/rcutree.c +++ linux-2.6/kernel/rcutree.c @@ -2787,67 +2787,59 @@ rcu_init_percpu_data(int cpu, struct rcu mutex_unlock(&rsp->onoff_mutex); } -static void __cpuinit rcu_prepare_cpu(int cpu) +int __cpuinit rcutree_prepare_cpu(unsigned int cpu) { struct rcu_state *rsp; for_each_rcu_flavor(rsp) rcu_init_percpu_data(cpu, rsp, strcmp(rsp->name, "rcu_preempt") == 0); + rcu_prepare_kthreads(cpu); + return 0; } -/* - * Handle CPU online/offline notification events. 
- */ -static int __cpuinit rcu_cpu_notify(struct notifier_block *self, - unsigned long action, void *hcpu) +int __cpuinit rcutree_dead_cpu(unsigned int cpu) { - long cpu = (long)hcpu; - struct rcu_data *rdp = per_cpu_ptr(rcu_state->rda, cpu); - struct rcu_node *rnp = rdp->mynode; struct rcu_state *rsp; - int ret = NOTIFY_OK; - trace_rcu_utilization("Start CPU hotplug"); - switch (action) { - case CPU_UP_PREPARE: - case CPU_UP_PREPARE_FROZEN: - rcu_prepare_cpu(cpu); - rcu_prepare_kthreads(cpu); - break; - case CPU_ONLINE: - case CPU_DOWN_FAILED: - rcu_boost_kthread_setaffinity(rnp, -1); - break; - case CPU_DOWN_PREPARE: - if (nocb_cpu_expendable(cpu)) - rcu_boost_kthread_setaffi
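The cpuhotplug.h hunk above also shows the convention that keeps the static tables in kernel/cpu.c buildable for every configuration: when a subsystem is compiled out, its callback names collapse to NULL and the corresponding step is simply skipped. A sketch of that header pattern for a hypothetical CONFIG_FOO:

        /* include/linux/cpuhotplug.h style stubs, "foo" being illustrative */
        #ifdef CONFIG_FOO
        int foo_prepare_cpu(unsigned int cpu);
        int foo_online_cpu(unsigned int cpu);
        int foo_dead_cpu(unsigned int cpu);
        #else
        #define foo_prepare_cpu        NULL
        #define foo_online_cpu         NULL
        #define foo_dead_cpu           NULL
        #endif

This is why the rcutree block guards on CONFIG_TREE_RCU || CONFIG_TREE_PREEMPT_RCU rather than on CONFIG_HOTPLUG_CPU alone: the table entries reference all five callbacks unconditionally.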
[patch 03/40] stop_machine: Use smpboot threads
Use the smpboot thread infrastructure. Mark the stopper thread selfparking and park it after it has finished the take_cpu_down() work. Signed-off-by: Thomas Gleixner --- kernel/cpu.c |2 kernel/stop_machine.c | 134 ++ 2 files changed, 51 insertions(+), 85 deletions(-) Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -254,6 +254,8 @@ static int __ref take_cpu_down(void *_pa return err; cpu_notify(CPU_DYING | param->mod, param->hcpu); + /* Park the stopper thread */ + kthread_park(current); return 0; } Index: linux-2.6/kernel/stop_machine.c === --- linux-2.6.orig/kernel/stop_machine.c +++ linux-2.6/kernel/stop_machine.c @@ -18,7 +18,7 @@ #include #include #include - +#include #include /* @@ -245,20 +245,25 @@ int try_stop_cpus(const struct cpumask * return ret; } -static int cpu_stopper_thread(void *data) +static int cpu_stop_should_run(unsigned int cpu) +{ + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); + unsigned long flags; + int run; + + spin_lock_irqsave(&stopper->lock, flags); + run = !list_empty(&stopper->works); + spin_unlock_irqrestore(&stopper->lock, flags); + return run; +} + +static void cpu_stopper_thread(unsigned int cpu) { - struct cpu_stopper *stopper = data; + struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); struct cpu_stop_work *work; int ret; repeat: - set_current_state(TASK_INTERRUPTIBLE); /* mb paired w/ kthread_stop */ - - if (kthread_should_stop()) { - __set_current_state(TASK_RUNNING); - return 0; - } - work = NULL; spin_lock_irq(&stopper->lock); if (!list_empty(&stopper->works)) { @@ -274,8 +279,6 @@ repeat: struct cpu_stop_done *done = work->done; char ksym_buf[KSYM_NAME_LEN] __maybe_unused; - __set_current_state(TASK_RUNNING); - /* cpu stop callbacks are not allowed to sleep */ preempt_disable(); @@ -291,87 +294,55 @@ repeat: ksym_buf), arg); cpu_stop_signal_done(done, true); - } else - schedule(); - - goto repeat; + goto repeat; + } } extern void sched_set_stop_task(int cpu, struct task_struct *stop); -/* manage stopper for a cpu, mostly lifted from sched migration thread mgmt */ -static int __cpuinit cpu_stop_cpu_callback(struct notifier_block *nfb, - unsigned long action, void *hcpu) +static void cpu_stop_create(unsigned int cpu) +{ + sched_set_stop_task(cpu, per_cpu(cpu_stopper_task, cpu)); +} + +static void cpu_stop_park(unsigned int cpu) { - unsigned int cpu = (unsigned long)hcpu; struct cpu_stopper *stopper = &per_cpu(cpu_stopper, cpu); - struct task_struct *p = per_cpu(cpu_stopper_task, cpu); + struct cpu_stop_work *work; + unsigned long flags; - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - BUG_ON(p || stopper->enabled || !list_empty(&stopper->works)); - p = kthread_create_on_node(cpu_stopper_thread, - stopper, - cpu_to_node(cpu), - "migration/%d", cpu); - if (IS_ERR(p)) - return notifier_from_errno(PTR_ERR(p)); - get_task_struct(p); - kthread_bind(p, cpu); - sched_set_stop_task(cpu, p); - per_cpu(cpu_stopper_task, cpu) = p; - break; + /* drain remaining works */ + spin_lock_irqsave(&stopper->lock, flags); + list_for_each_entry(work, &stopper->works, list) + cpu_stop_signal_done(work->done, false); + stopper->enabled = false; + spin_unlock_irqrestore(&stopper->lock, flags); +} - case CPU_ONLINE: - /* strictly unnecessary, as first user will wake it */ - wake_up_process(p); - /* mark enabled */ - spin_lock_irq(&stopper->lock); - stopper->enabled = true; - spin_unlock_irq(&stopper->lock); - break; - -#ifdef CONFIG_HOTPLUG_CPU - case CPU_UP_CANCELED: - case CPU_POST_DEAD: - { 
- struct cpu_stop_work *work; - - sched_set_stop_task(cpu, NULL); - /* kill the stopper */ - kthread_stop(p); - /* drain
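The hunk is truncated before the interesting conclusion: the notifier goes away entirely and the stopper threads are handed to the generic smpboot infrastructure, with the new cpu_stop_should_run()/cpu_stopper_thread()/cpu_stop_create()/cpu_stop_park() functions becoming hooks of a struct smp_hotplug_thread. The descriptor below is a sketch of what such a registration generally looks like; the field names come from the smpboot API as I recall it, not from the cut-off part of this patch, and "foo" is a placeholder:

        #include <linux/init.h>
        #include <linux/percpu.h>
        #include <linux/sched.h>
        #include <linux/smpboot.h>

        static DEFINE_PER_CPU(struct task_struct *, foo_task);

        static int foo_should_run(unsigned int cpu)
        {
                return 0;       /* is there queued work for this CPU? */
        }

        static void foo_thread_fn(unsigned int cpu)
        {
                /* handle one batch of work; the smpboot core loops around this */
        }

        static void foo_create(unsigned int cpu)
        {
                /* one-time setup when the per-CPU thread is first created */
        }

        static void foo_park(unsigned int cpu)
        {
                /* flush or fail pending work before the thread parks on CPU down */
        }

        static struct smp_hotplug_thread foo_threads = {
                .store                  = &foo_task,
                .thread_should_run      = foo_should_run,
                .thread_fn              = foo_thread_fn,
                .thread_comm            = "foo/%u",
                .create                 = foo_create,
                .park                   = foo_park,
                .selfparking            = true, /* parked by its user, not by the core */
        };

        static int __init foo_threads_init(void)
        {
                return smpboot_register_percpu_thread(&foo_threads);
        }

The kthread_park(current) added to take_cpu_down() in the visible kernel/cpu.c hunk is the counterpart of .selfparking: the stopper parks itself once the teardown work is done instead of being parked by the hotplug core.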
[patch 11/40] x86: uncore: Move teardown callback to CPU_DEAD
No point calling this from the dying cpu. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_intel_uncore.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel_uncore.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c @@ -2622,7 +2622,7 @@ static void __init uncore_pci_exit(void) } } -static void __cpuinit uncore_cpu_dying(int cpu) +static void __cpuinit uncore_cpu_dead(int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2803,8 +2803,8 @@ static int uncore_cpu_starting(cpu); break; case CPU_UP_CANCELED: - case CPU_DYING: - uncore_cpu_dying(cpu); + case CPU_DEAD: + uncore_cpu_dead(cpu); break; default: break; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 12/40] x86: uncore: Convert to hotplug state machine
Convert the notifiers to state machine states and let the core code do the setup for the already online cpus. This notifier has a completely undocumented ordering requirement versus perf hardcoded in the notifier priority. Move the callback to the proper place in the state machine. Note, the original code did not check the return values of the setup functions and I could not be bothered to twist my brain around undoing the previous steps. Marked with a FIXME. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/perf_event_intel_uncore.c | 109 ++ include/linux/cpuhotplug.h|3 2 files changed, 30 insertions(+), 82 deletions(-) Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c === --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel_uncore.c +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel_uncore.c @@ -2622,7 +2622,7 @@ static void __init uncore_pci_exit(void) } } -static void __cpuinit uncore_cpu_dead(int cpu) +static int __cpuinit uncore_dead_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2639,9 +2639,11 @@ static void __cpuinit uncore_cpu_dead(in kfree(box); } } + return 0; } -static int __cpuinit uncore_cpu_starting(int cpu) +/* Must run on the target cpu */ +static int __cpuinit uncore_starting_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; @@ -2681,12 +2683,12 @@ static int __cpuinit uncore_cpu_starting return 0; } -static int __cpuinit uncore_cpu_prepare(int cpu, int phys_id) +static int __cpuinit uncore_prepare_cpu(unsigned int cpu) { struct intel_uncore_type *type; struct intel_uncore_pmu *pmu; struct intel_uncore_box *box; - int i, j; + int i, j, phys_id = -1; for (i = 0; msr_uncores[i]; i++) { type = msr_uncores[i]; @@ -2745,13 +2747,13 @@ uncore_change_context(struct intel_uncor } } -static void __cpuinit uncore_event_exit_cpu(int cpu) +static int __cpuinit uncore_offline_cpu(unsigned int cpu) { int i, phys_id, target; /* if exiting cpu is used for collecting uncore events */ if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask)) - return; + return 0; /* find a new cpu to collect uncore events */ phys_id = topology_physical_package_id(cpu); @@ -2771,78 +2773,29 @@ static void __cpuinit uncore_event_exit_ uncore_change_context(msr_uncores, cpu, target); uncore_change_context(pci_uncores, cpu, target); + return 0; } -static void __cpuinit uncore_event_init_cpu(int cpu) +static int __cpuinit uncore_online_cpu(unsigned int cpu) { int i, phys_id; phys_id = topology_physical_package_id(cpu); for_each_cpu(i, &uncore_cpu_mask) { if (phys_id == topology_physical_package_id(i)) - return; + return 0; } cpumask_set_cpu(cpu, &uncore_cpu_mask); uncore_change_context(msr_uncores, -1, cpu); uncore_change_context(pci_uncores, -1, cpu); -} - -static int - __cpuinit uncore_cpu_notifier(struct notifier_block *self, unsigned long action, void *hcpu) -{ - unsigned int cpu = (long)hcpu; - - /* allocate/free data structure for uncore box */ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_UP_PREPARE: - uncore_cpu_prepare(cpu, -1); - break; - case CPU_STARTING: - uncore_cpu_starting(cpu); - break; - case CPU_UP_CANCELED: - case CPU_DEAD: - uncore_cpu_dead(cpu); - break; - default: - break; - } - - /* select the cpu that collects uncore events */ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DOWN_FAILED: - case CPU_STARTING: - uncore_event_init_cpu(cpu); - break; - case CPU_DOWN_PREPARE: - uncore_event_exit_cpu(cpu); - break; - default: - break; - } - - return NOTIFY_OK; -} - -static struct 
notifier_block uncore_cpu_nb __cpuinitdata = { - .notifier_call = uncore_cpu_notifier, - /* -* to migrate uncore events, our notifier should be executed -* before perf core's notifier. -*/ - .priority = CPU_PRI_PERF + 1, -}; - -static void __init uncore_cpu_setup(void *dummy) -{ - uncore_cpu_starting(smp_processor_id()); + return 0; } static int __init uncore_cpu_init(void) { - int ret, cpu, max_cores; + int ret, max_cores; max_cores = boot_cpu_data.x86_max_cores; switch (boot_cpu_data.x86_model) { @@ -2879,28 +283
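The changelog's point about the undocumented ordering requirement is visible in the removed notifier: ".priority = CPU_PRI_PERF + 1" was the only thing guaranteeing that the uncore callbacks ran in the right order relative to the core perf callbacks. In the state machine that constraint is presumably carried by the position of the states in the enum, which at least makes it greppable in one place. Excerpted state names from the cpuhotplug.h hunks in this series, comments mine:

        enum cpuhp_states {
                /* ... */
                CPUHP_AP_PERF_X86_UNCORE_STARTING,      /* uncore before ... */
                CPUHP_AP_PERF_X86_AMD_IBS_STARTING,
                CPUHP_AP_PERF_X86_STARTING,             /* ... the core x86 PMU */
                /* ... */
        };

Since the series never states it outright here, treat the exact run order (forward on bringup, reverse on teardown) as an assumption of this note rather than documented behaviour.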
[patch 31/40] sched: Convert fair nohz balancer to hotplug state machine
Straight forward conversion which leaves the question whether this couldn't be combined with already existing infrastructure in the scheduler instead of having an extra state. Signed-off-by: Thomas Gleixner --- include/linux/cpuhotplug.h |6 ++ kernel/cpu.c |4 kernel/sched/fair.c| 16 ++-- 3 files changed, 12 insertions(+), 14 deletions(-) Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -29,6 +29,7 @@ enum cpuhp_states { CPUHP_AP_NOTIFY_DYING, CPUHP_AP_X86_TBOOT_DYING, CPUHP_AP_S390_VTIME_DYING, + CPUHP_AP_SCHED_NOHZ_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, CPUHP_TEARDOWN_CPU, @@ -126,6 +127,11 @@ int sched_migration_dead_cpu(unsigned in #define sched_migration_dying_cpu NULL #define sched_migration_dead_cpu NULL #endif +#if defined(CONFIG_NO_HZ) +int nohz_balance_exit_idle(unsigned int cpu); +#else +#define nohz_balance_exit_idle NULL +#endif /* Performance counter hotplug functions */ #ifdef CONFIG_PERF_EVENTS Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -813,6 +813,10 @@ static struct cpuhp_step cpuhp_ap_states .startup = NULL, .teardown = notify_dying, }, + [CPUHP_AP_SCHED_NOHZ_DYING] = { + .startup = NULL, + .teardown = nohz_balance_exit_idle, + }, [CPUHP_AP_SCHED_MIGRATE_DYING] = { .startup = NULL, .teardown = sched_migration_dying_cpu, Index: linux-2.6/kernel/sched/fair.c === --- linux-2.6.orig/kernel/sched/fair.c +++ linux-2.6/kernel/sched/fair.c @@ -5390,13 +5390,14 @@ static void nohz_balancer_kick(int cpu) return; } -static inline void nohz_balance_exit_idle(int cpu) +int nohz_balance_exit_idle(unsigned int cpu) { if (unlikely(test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu { cpumask_clear_cpu(cpu, nohz.idle_cpus_mask); atomic_dec(&nohz.nr_cpus); clear_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)); } + return 0; } static inline void set_cpu_sd_state_busy(void) @@ -5448,18 +5449,6 @@ void nohz_balance_enter_idle(int cpu) atomic_inc(&nohz.nr_cpus); set_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)); } - -static int __cpuinit sched_ilb_notifier(struct notifier_block *nfb, - unsigned long action, void *hcpu) -{ - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DYING: - nohz_balance_exit_idle(smp_processor_id()); - return NOTIFY_OK; - default: - return NOTIFY_DONE; - } -} #endif static DEFINE_SPINLOCK(balancing); @@ -6167,7 +6156,6 @@ __init void init_sched_fair_class(void) #ifdef CONFIG_NO_HZ nohz.next_balance = jiffies; zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT); - cpu_notifier(sched_ilb_notifier, 0); #endif #endif /* SMP */ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
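Worth spelling out what the sched/fair.c hunk actually changes: nohz_balance_exit_idle() loses its "static inline void (int cpu)" signature and becomes the int (*)(unsigned int) shape the state table expects, returning 0 unconditionally. Where changing the helper itself is undesirable, the same conversion can be done with a thin adapter; a sketch with placeholder names, not what this patch does:

        static void foo_exit_idle(int cpu)
        {
                /* stands in for the existing void helper, left untouched */
        }

        static int foo_dying_cpu(unsigned int cpu)
        {
                /* adapter with the int (*)(unsigned int) signature the table wants */
                foo_exit_idle(cpu);
                return 0;
        }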
[patch 29/40] s390: Convert vtime to hotplug state machine
Signed-off-by: Thomas Gleixner --- arch/s390/kernel/vtime.c | 18 +- include/linux/cpuhotplug.h |1 + 2 files changed, 6 insertions(+), 13 deletions(-) Index: linux-2.6/arch/s390/kernel/vtime.c === --- linux-2.6.orig/arch/s390/kernel/vtime.c +++ linux-2.6/arch/s390/kernel/vtime.c @@ -382,25 +382,17 @@ void __cpuinit init_cpu_vtimer(void) set_vtimer(VTIMER_MAX_SLICE); } -static int __cpuinit s390_nohz_notify(struct notifier_block *self, - unsigned long action, void *hcpu) +static int __cpuinit s390_vtime_dying_cpu(unsigned int cpu) { - struct s390_idle_data *idle; - long cpu = (long) hcpu; + struct s390_idle_data *idle = &per_cpu(s390_idle, cpu); - idle = &per_cpu(s390_idle, cpu); - switch (action & ~CPU_TASKS_FROZEN) { - case CPU_DYING: - idle->nohz_delay = 0; - default: - break; - } - return NOTIFY_OK; + idle->nohz_delay = 0; + return 0; } void __init vtime_init(void) { /* Enable cpu timer interrupts on the boot cpu. */ init_cpu_vtimer(); - cpu_notifier(s390_nohz_notify, 0); + cpuhp_setup_state(CPUHP_AP_S390_VTIME_DYING, NULL, s390_vtime_dying_cpu); } Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -27,6 +27,7 @@ enum cpuhp_states { CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, CPUHP_AP_NOTIFY_DYING, + CPUHP_AP_S390_VTIME_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, CPUHP_TEARDOWN_CPU, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
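vtime_init() above shows the teardown-only case: the startup argument to cpuhp_setup_state() is NULL, so nothing happens on the way up and only s390_vtime_dying_cpu() runs when a CPU goes down, replacing a notifier that ignored everything but CPU_DYING. A minimal sketch of that shape with placeholder names (CPUHP_AP_FOO_DYING would have to exist in the enum):

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>

        static int foo_dying_cpu(unsigned int cpu)
        {
                /* only the down direction matters for this user */
                return 0;
        }

        void __init foo_init(void)
        {
                /* NULL startup: the up direction stays a no-op */
                cpuhp_setup_state(CPUHP_AP_FOO_DYING, NULL, foo_dying_cpu);
        }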
[patch 28/40] cpuhotplug: Remove CPU_STARTING notifier
All users converted to state machine. Signed-off-by: Thomas Gleixner --- include/linux/cpu.h|6 -- include/linux/cpuhotplug.h |1 - kernel/cpu.c | 13 + 3 files changed, 1 insertion(+), 19 deletions(-) Index: linux-2.6/include/linux/cpu.h === --- linux-2.6.orig/include/linux/cpu.h +++ linux-2.6/include/linux/cpu.h @@ -67,10 +67,6 @@ extern ssize_t arch_print_cpu_modalias(s * sleep, must not fail */ #define CPU_POST_DEAD 0x0009 /* CPU (unsigned)v dead, cpu_hotplug * lock is dropped */ -#define CPU_STARTING 0x000A /* CPU (unsigned)v soon running. - * Called on the new cpu, just before - * enabling interrupts. Must not sleep, - * must not fail */ /* Used for CPU hotplug events occurring while tasks are frozen due to a suspend * operation in progress @@ -84,8 +80,6 @@ extern ssize_t arch_print_cpu_modalias(s #define CPU_DOWN_FAILED_FROZEN (CPU_DOWN_FAILED | CPU_TASKS_FROZEN) #define CPU_DEAD_FROZEN(CPU_DEAD | CPU_TASKS_FROZEN) #define CPU_DYING_FROZEN (CPU_DYING | CPU_TASKS_FROZEN) -#define CPU_STARTING_FROZEN(CPU_STARTING | CPU_TASKS_FROZEN) - #ifdef CONFIG_SMP extern bool cpuhp_tasks_frozen; Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -26,7 +26,6 @@ enum cpuhp_states { CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_KVM_STARTING, - CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, CPUHP_AP_SCHED_MIGRATE_DYING, CPUHP_AP_MAX, Index: linux-2.6/kernel/cpu.c === --- linux-2.6.orig/kernel/cpu.c +++ linux-2.6/kernel/cpu.c @@ -216,12 +216,6 @@ static int bringup_cpu(unsigned int cpu) return 0; } -static int notify_starting(unsigned int cpu) -{ - cpu_notify(CPU_STARTING, cpu); - return 0; -} - #ifdef CONFIG_HOTPLUG_CPU EXPORT_SYMBOL(register_cpu_notifier); @@ -446,10 +440,9 @@ EXPORT_SYMBOL(cpu_down); #endif /*CONFIG_HOTPLUG_CPU*/ /** - * notify_cpu_starting(cpu) - call the CPU_STARTING notifiers + * notify_cpu_starting(cpu) - Invoke the callbacks on the starting CPU * @cpu: cpu that just started * - * This function calls the cpu_chain notifiers with CPU_STARTING. * It must be called by the arch code on the new cpu, before the new cpu * enables interrupts and before the "boot" cpu returns from __cpu_up(). */ @@ -816,10 +809,6 @@ static struct cpuhp_step cpuhp_ap_states .startup = sched_starting_cpu, .teardown = NULL, }, - [CPUHP_AP_NOTIFY_STARTING] = { - .startup = notify_starting, - .teardown = NULL, - }, [CPUHP_AP_NOTIFY_DYING] = { .startup = NULL, .teardown = notify_dying, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 26/40] arm: perf: Convert to hotplug state machine
Straight forward conversion w/o bells and whistles. Signed-off-by: Thomas Gleixner --- arch/arm/kernel/perf_event_cpu.c | 28 +--- include/linux/cpuhotplug.h |1 + 2 files changed, 6 insertions(+), 23 deletions(-) Index: linux-2.6/arch/arm/kernel/perf_event_cpu.c === --- linux-2.6.orig/arch/arm/kernel/perf_event_cpu.c +++ linux-2.6/arch/arm/kernel/perf_event_cpu.c @@ -157,24 +157,13 @@ static void cpu_pmu_init(struct arm_pmu * UNKNOWN at reset, the PMU must be explicitly reset to avoid reading * junk values out of them. */ -static int __cpuinit cpu_pmu_notify(struct notifier_block *b, - unsigned long action, void *hcpu) +static int __cpuinit arm_perf_starting_cpu(unsigned int cpu) { - if ((action & ~CPU_TASKS_FROZEN) != CPU_STARTING) - return NOTIFY_DONE; - if (cpu_pmu && cpu_pmu->reset) cpu_pmu->reset(cpu_pmu); - else - return NOTIFY_DONE; - - return NOTIFY_OK; + return 0; } -static struct notifier_block __cpuinitdata cpu_pmu_hotplug_notifier = { - .notifier_call = cpu_pmu_notify, -}; - /* * PMU platform driver and devicetree bindings. */ @@ -304,16 +293,9 @@ static struct platform_driver cpu_pmu_dr static int __init register_pmu_driver(void) { - int err; - - err = register_cpu_notifier(&cpu_pmu_hotplug_notifier); - if (err) - return err; - - err = platform_driver_register(&cpu_pmu_driver); - if (err) - unregister_cpu_notifier(&cpu_pmu_hotplug_notifier); + int err = platform_driver_register(&cpu_pmu_driver); - return err; + return err ? err : cpuhp_setup_state_nocalls(CPUHP_AP_PERF_ARM_STARTING, +arm_perf_starting_cpu, NULL); } device_initcall(register_pmu_driver); Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_PERF_ARM_STARTING, CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 25/40] arm: Convert VFP hotplug notifiers to state machine
Straight forward conversion plus commentry why code which is executed in hotplug callbacks needs to be invoked before installing them. Signed-off-by: Thomas Gleixner --- arch/arm/vfp/vfpmodule.c | 29 + include/linux/cpuhotplug.h |1 + 2 files changed, 18 insertions(+), 12 deletions(-) Index: linux-2.6/arch/arm/vfp/vfpmodule.c === --- linux-2.6.orig/arch/arm/vfp/vfpmodule.c +++ linux-2.6/arch/arm/vfp/vfpmodule.c @@ -633,19 +633,19 @@ int vfp_restore_user_hwstate(struct user * hardware state at every thread switch. We clear our held state when * a CPU has been killed, indicating that the VFP hardware doesn't contain * a threads VFP state. When a CPU starts up, we re-enable access to the - * VFP hardware. - * - * Both CPU_DYING and CPU_STARTING are called on the CPU which + * VFP hardware. The callbacks below are called on the CPU which * is being offlined/onlined. */ -static int vfp_hotplug(struct notifier_block *b, unsigned long action, - void *hcpu) +static int __cpuinit vfp_dying_cpu(unsigned int cpu) { - if (action == CPU_DYING || action == CPU_DYING_FROZEN) { - vfp_force_reload((long)hcpu, current_thread_info()); - } else if (action == CPU_STARTING || action == CPU_STARTING_FROZEN) - vfp_enable(NULL); - return NOTIFY_OK; + vfp_force_reload(cpu, current_thread_info()); + return 0; +} + +static int __cpuinit vfp_starting_cpu(unsigned int unused) +{ + vfp_enable(NULL); + return 0; } /* @@ -653,9 +653,13 @@ static int vfp_hotplug(struct notifier_b */ static int __init vfp_init(void) { - unsigned int vfpsid; unsigned int cpu_arch = cpu_architecture(); + unsigned int vfpsid; + /* +* Enable the access to the VFP on all online cpus so the +* following test on FPSID will succeed. +*/ if (cpu_arch >= CPU_ARCH_ARMv6) on_each_cpu(vfp_enable, NULL, 1); @@ -676,7 +680,8 @@ static int __init vfp_init(void) else if (vfpsid & FPSID_NODOUBLE) { pr_cont("no double precision support\n"); } else { - hotcpu_notifier(vfp_hotplug, 0); + cpuhp_setup_state_nocall(CPUHP_AP_ARM_VFP_STARTING, +vfp_starting_cpu, vfp_dying_cpu); VFP_arch = (vfpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT; /* Extract the architecture version */ pr_cont("implementor %02x architecture %d part %02x variant %x rev %x\n", Index: linux-2.6/include/linux/cpuhotplug.h === --- linux-2.6.orig/include/linux/cpuhotplug.h +++ linux-2.6/include/linux/cpuhotplug.h @@ -22,6 +22,7 @@ enum cpuhp_states { CPUHP_AP_PERF_X86_UNCORE_STARTING, CPUHP_AP_PERF_X86_AMD_IBS_STARTING, CPUHP_AP_PERF_X86_STARTING, + CPUHP_AP_ARM_VFP_STARTING, CPUHP_AP_ARM64_TIMER_STARTING, CPUHP_AP_NOTIFY_STARTING, CPUHP_AP_NOTIFY_DYING, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
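The comment added to vfp_init() is the key ordering point of both ARM patches: the registration variant used here (spelled cpuhp_setup_state_nocalls in the ARM perf patch and cpuhp_setup_state_nocall in the VFP hunk) apparently installs the callbacks without invoking them for CPUs that are already online, so whatever those callbacks would have done must be done by hand first, here via on_each_cpu(vfp_enable, ...). A sketch of that idiom with placeholder names, assuming exactly that semantic for cpuhp_setup_state_nocalls():

        #include <linux/cpuhotplug.h>
        #include <linux/init.h>
        #include <linux/smp.h>

        static void foo_enable(void *unused)
        {
                /* the work the STARTING callback would do, run on each online CPU */
        }

        static int foo_starting_cpu(unsigned int cpu)
        {
                foo_enable(NULL);
                return 0;
        }

        static int foo_dying_cpu(unsigned int cpu)
        {
                return 0;
        }

        static int __init foo_init(void)
        {
                /* 1) bring the CPUs that are already online into the expected state */
                on_each_cpu(foo_enable, NULL, 1);

                /* 2) then install callbacks for CPUs that come and go later */
                cpuhp_setup_state_nocalls(CPUHP_AP_FOO_STARTING,
                                          foo_starting_cpu, foo_dying_cpu);
                return 0;
        }
        device_initcall(foo_init);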