[RFC] ACPI vs device ordering on resume

2006-11-14 Thread Stephen Hemminger
If I do a suspend-to-ram then resume on a Sony Vaio laptop with sky2 driver,
the first interrupt gets misrouted to the original shared IRQ, rather than
to the MSI irq expected.

During the pci_restore process, the MSI information and the PCI command 
register 
are restored properly. But later during resume, inside the ACPI evaluation of
the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being cleared.
My guess is that the BIOS ends up doing some resetting of devices.

I may be able to workaround the problem for this one device, but it brings up
a more general issue about what the ordering should be during resume. If ACPI
evaluation (which I assume talks to the BIOS), might change device state, it
seems that ACPI code should execute before resuming devices not after. But 
changing
the order here seems drastic.

An alternate solution would be to have two pm_ops, one for early_resume
and another for late, and split the ACPI work.

--- 2.6.19-rc5.orig/kernel/power/main.c 2006-11-14 14:24:37.0 -0800
+++ 2.6.19-rc5/kernel/power/main.c  2006-11-14 14:25:23.0 -0800
@@ -132,12 +132,12 @@
 
 static void suspend_finish(suspend_state_t state)
 {
+   if (pm_ops && pm_ops->finish)
+   pm_ops->finish(state);
device_resume();
resume_console();
thaw_processes();
enable_nonboot_cpus();
-   if (pm_ops && pm_ops->finish)
-   pm_ops->finish(state);
pm_restore_console();
 }
 


-- 
Stephen Hemminger <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ACPI vs device ordering on resume

2006-11-14 Thread Linus Torvalds


On Tue, 14 Nov 2006, Stephen Hemminger wrote:
> 
> During the pci_restore process, the MSI information and the PCI command 
> register 
> are restored properly. But later during resume, inside the ACPI evaluation of
> the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being cleared.
> My guess is that the BIOS ends up doing some resetting of devices.

Hmm.. I think calling "pm_ops->finish()" early is the right thing to do, 
so your patch looks fine. I'd hate to apply it at this stage in the 2.6.19 
development, though. Comments?

Looking at the code (at least for ACPI), pm_ops->finish() function does 
things that we'd generally want done early. I'm a bit nervous about the 
"nesting", though - we call the "->prepare" function _before_ we do the 
device suspend stuff, so if "->finish" undoes something that was done by 
"->prepare", then we will restore the devices with the state they were in 
_after_ the "->prepare".

So from a logical _nesting_ perspective, the "->finish()" routine should 
happen after devices have been restored. And anything that ACPI does to 
undo "->enter" should have been done early by ACPI itself as it was 
exiting that "->enter" routine - that would make the things nest properly.

Gaah. Since there is no way to know what the HELL those ACPI routines will 
actually end up doing, there is no way to know what is the right answer. 
Does anybody know what Windows does? 

I think Stephen's patch is likely good, but I really don't see myself 
applying it to my tree right now. It should either go into -mm, or early 
into the post-2.6.19 tree (or, most likely, both).

Anybody?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-11-14 Thread Len Brown
On Tuesday 14 November 2006 18:30, Stephen Hemminger wrote:
> If I do a suspend-to-ram then resume on a Sony Vaio laptop with sky2 driver,
> the first interrupt gets misrouted to the original shared IRQ, rather than
> to the MSI irq expected.
> 
> During the pci_restore process, the MSI information and the PCI command 
> register 
> are restored properly. But later during resume, inside the ACPI evaluation of
> the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being cleared.
> My guess is that the BIOS ends up doing some resetting of devices.
> 
> I may be able to workaround the problem for this one device, but it brings up
> a more general issue about what the ordering should be during resume. If ACPI
> evaluation (which I assume talks to the BIOS), might change device state, it
> seems that ACPI code should execute before resuming devices not after. But 
> changing
> the order here seems drastic.
> 
> An alternate solution would be to have two pm_ops, one for early_resume
> and another for late, and split the ACPI work.
> 
> --- 2.6.19-rc5.orig/kernel/power/main.c   2006-11-14 14:24:37.0 
> -0800
> +++ 2.6.19-rc5/kernel/power/main.c2006-11-14 14:25:23.0 -0800
> @@ -132,12 +132,12 @@
>  
>  static void suspend_finish(suspend_state_t state)
>  {
> + if (pm_ops && pm_ops->finish)
> + pm_ops->finish(state);
>   device_resume();
>   resume_console();
>   thaw_processes();
>   enable_nonboot_cpus();
> - if (pm_ops && pm_ops->finish)
> - pm_ops->finish(state);
>   pm_restore_console();
>  }

Yes, I agree that _WAK needs to come before device_resume().
Need to let any BIOS nasties happen and get over with before we restore device 
drivers.
This is consistent with the wording in ACPI 3.0b (section 7.4) that says
11. _WAK is run
12. OSPM notifies all native device drivefrs of the return from the sleep state 
transition

However, commit 1a38416cea8ac801ae8f261074721f35317613dc says that
_WAK must follow INIT -- ie finish() must come after enable_nonboot_cpus(),
and this patch as it stands would violate that.

So it looks like we need this sequence:

enable_nonboot_cpus() /* INIT */
finish()/* _WAK */
device_resume()

-Len
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-11-15 Thread Rafael J. Wysocki
On Wednesday, 15 November 2006 08:03, Len Brown wrote:
> On Tuesday 14 November 2006 18:30, Stephen Hemminger wrote:
> > If I do a suspend-to-ram then resume on a Sony Vaio laptop with sky2 driver,
> > the first interrupt gets misrouted to the original shared IRQ, rather than
> > to the MSI irq expected.
> > 
> > During the pci_restore process, the MSI information and the PCI command 
> > register 
> > are restored properly. But later during resume, inside the ACPI evaluation 
> > of
> > the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being cleared.
> > My guess is that the BIOS ends up doing some resetting of devices.
> > 
> > I may be able to workaround the problem for this one device, but it brings 
> > up
> > a more general issue about what the ordering should be during resume. If 
> > ACPI
> > evaluation (which I assume talks to the BIOS), might change device state, it
> > seems that ACPI code should execute before resuming devices not after. But 
> > changing
> > the order here seems drastic.
> > 
> > An alternate solution would be to have two pm_ops, one for early_resume
> > and another for late, and split the ACPI work.
> > 
> > --- 2.6.19-rc5.orig/kernel/power/main.c 2006-11-14 14:24:37.0 
> > -0800
> > +++ 2.6.19-rc5/kernel/power/main.c  2006-11-14 14:25:23.0 -0800
> > @@ -132,12 +132,12 @@
> >  
> >  static void suspend_finish(suspend_state_t state)
> >  {
> > +   if (pm_ops && pm_ops->finish)
> > +   pm_ops->finish(state);
> > device_resume();
> > resume_console();
> > thaw_processes();
> > enable_nonboot_cpus();
> > -   if (pm_ops && pm_ops->finish)
> > -   pm_ops->finish(state);
> > pm_restore_console();
> >  }
> 
> Yes, I agree that _WAK needs to come before device_resume().
> Need to let any BIOS nasties happen and get over with before we restore 
> device drivers.
> This is consistent with the wording in ACPI 3.0b (section 7.4) that says
> 11. _WAK is run
> 12. OSPM notifies all native device drivefrs of the return from the sleep 
> state transition
> 
> However, commit 1a38416cea8ac801ae8f261074721f35317613dc says that
> _WAK must follow INIT -- ie finish() must come after enable_nonboot_cpus(),
> and this patch as it stands would violate that.
> 
> So it looks like we need this sequence:
> 
> enable_nonboot_cpus() /* INIT */
> finish()  /* _WAK */
> device_resume()

Which is a problem, because thaw_processes() is not SMP-safe.

Greetings,
Rafael


-- 
You never change things by fighting the existing reality.
R. Buckminster Fuller
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-11-15 Thread Linus Torvalds


On Wed, 15 Nov 2006, Len Brown wrote:
> 
> So it looks like we need this sequence:
> 
> enable_nonboot_cpus() /* INIT */
> finish()  /* _WAK */
> device_resume()

Can somebody remind me about this immediately after 2.6.19?

No way am I going to make that kind of a major ordering change right now, 
especially as the thing isn't apparently even a real regression (just more 
fallout from enabling MSI).

It would be good if people who are affected (and people in general that 
are interested in power management) would test out the patch.

In particular, this is an even bigger change than the one suggested by 
Stephen. It means, for example, that we will have device resume (and 
process thawing) being called when we're in SMP mode - and that may or may 
not have issues. So this is a really scary patch to me, no way in hell 
will I apply it right now.

But if people try it out, it would be good..

Btw, this is a clear example of where it might be good to start actively 
using the "early_resume" thing, and do PCI bus resume there. We might want 
to do early_resume _before_ calling the bios (I'm not at all convinced 
that the firmware will do the right thing without the PCI buses set up, 
for example), and _before_ thawing processes. Then, we might let 
individual devices do their "device resume" in the normal late resume 
phase.

Greg already carries the patch around for that PCI bus early resume thing, 
I think. 

We need to test these things.

Linus


diff --git a/kernel/power/main.c b/kernel/power/main.c
index 873228c..2989609 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -132,12 +132,12 @@ int suspend_enter(suspend_state_t state)
 
 static void suspend_finish(suspend_state_t state)
 {
-   device_resume();
-   resume_console();
-   thaw_processes();
enable_nonboot_cpus();
if (pm_ops && pm_ops->finish)
pm_ops->finish(state);
+   device_resume();
+   resume_console();
+   thaw_processes();
pm_restore_console();
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-11-30 Thread Stephen Hemminger
On Wed, 15 Nov 2006 02:03:30 -0500
Len Brown <[EMAIL PROTECTED]> wrote:

> On Tuesday 14 November 2006 18:30, Stephen Hemminger wrote:
> > If I do a suspend-to-ram then resume on a Sony Vaio laptop with sky2 driver,
> > the first interrupt gets misrouted to the original shared IRQ, rather than
> > to the MSI irq expected.
> > 
> > During the pci_restore process, the MSI information and the PCI command 
> > register 
> > are restored properly. But later during resume, inside the ACPI evaluation 
> > of
> > the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being cleared.
> > My guess is that the BIOS ends up doing some resetting of devices.
> > 
> > I may be able to workaround the problem for this one device, but it brings 
> > up
> > a more general issue about what the ordering should be during resume. If 
> > ACPI
> > evaluation (which I assume talks to the BIOS), might change device state, it
> > seems that ACPI code should execute before resuming devices not after. But 
> > changing
> > the order here seems drastic.
> > 
> > An alternate solution would be to have two pm_ops, one for early_resume
> > and another for late, and split the ACPI work.
> > 
> > --- 2.6.19-rc5.orig/kernel/power/main.c 2006-11-14 14:24:37.0 
> > -0800
> > +++ 2.6.19-rc5/kernel/power/main.c  2006-11-14 14:25:23.0 -0800
> > @@ -132,12 +132,12 @@
> >  
> >  static void suspend_finish(suspend_state_t state)
> >  {
> > +   if (pm_ops && pm_ops->finish)
> > +   pm_ops->finish(state);
> > device_resume();
> > resume_console();
> > thaw_processes();
> > enable_nonboot_cpus();
> > -   if (pm_ops && pm_ops->finish)
> > -   pm_ops->finish(state);
> > pm_restore_console();
> >  }
> 
> Yes, I agree that _WAK needs to come before device_resume().
> Need to let any BIOS nasties happen and get over with before we restore 
> device drivers.
> This is consistent with the wording in ACPI 3.0b (section 7.4) that says
> 11. _WAK is run
> 12. OSPM notifies all native device drivefrs of the return from the sleep 
> state transition
> 
> However, commit 1a38416cea8ac801ae8f261074721f35317613dc says that
> _WAK must follow INIT -- ie finish() must come after enable_nonboot_cpus(),
> and this patch as it stands would violate that.
> 
> So it looks like we need this sequence:
> 
> enable_nonboot_cpus() /* INIT */
> finish()  /* _WAK */
> device_resume()
> 

Do you want to do this, or shall I? send off a patch.
I can test on about 5 machines first.
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Pavel Machek
Hi!

> > So it looks like we need this sequence:
> > 
> > enable_nonboot_cpus() /* INIT */
> > finish()/* _WAK */
> > device_resume()
> 
> Can somebody remind me about this immediately after 2.6.19?

Remind. But note that freezer is not yet SMP safe... Rafael is working
on that.
Pavel

> No way am I going to make that kind of a major ordering change right now, 
> especially as the thing isn't apparently even a real regression (just more 
> fallout from enabling MSI).
> 
> It would be good if people who are affected (and people in general that 
> are interested in power management) would test out the patch.
> 
> In particular, this is an even bigger change than the one suggested by 
> Stephen. It means, for example, that we will have device resume (and 
> process thawing) being called when we're in SMP mode - and that may or may 
> not have issues. So this is a really scary patch to me, no way in hell 
> will I apply it right now.
> 
> But if people try it out, it would be good..
> 
> Btw, this is a clear example of where it might be good to start actively 
> using the "early_resume" thing, and do PCI bus resume there. We might want 
> to do early_resume _before_ calling the bios (I'm not at all convinced 
> that the firmware will do the right thing without the PCI buses set up, 
> for example), and _before_ thawing processes. Then, we might let 
> individual devices do their "device resume" in the normal late resume 
> phase.
> 
> Greg already carries the patch around for that PCI bus early resume thing, 
> I think. 
> 
> We need to test these things.
> 
>   Linus
> 
> 
> diff --git a/kernel/power/main.c b/kernel/power/main.c
> index 873228c..2989609 100644
> --- a/kernel/power/main.c
> +++ b/kernel/power/main.c
> @@ -132,12 +132,12 @@ int suspend_enter(suspend_state_t state)
>  
>  static void suspend_finish(suspend_state_t state)
>  {
> - device_resume();
> - resume_console();
> - thaw_processes();
>   enable_nonboot_cpus();
>   if (pm_ops && pm_ops->finish)
>   pm_ops->finish(state);
> + device_resume();
> + resume_console();
> + thaw_processes();
>   pm_restore_console();
>  }
>  

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Rafael J. Wysocki
On Friday, 1 December 2006 02:48, Stephen Hemminger wrote:
> On Wed, 15 Nov 2006 02:03:30 -0500
> Len Brown <[EMAIL PROTECTED]> wrote:
> 
> > On Tuesday 14 November 2006 18:30, Stephen Hemminger wrote:
> > > If I do a suspend-to-ram then resume on a Sony Vaio laptop with sky2 
> > > driver,
> > > the first interrupt gets misrouted to the original shared IRQ, rather than
> > > to the MSI irq expected.
> > > 
> > > During the pci_restore process, the MSI information and the PCI command 
> > > register 
> > > are restored properly. But later during resume, inside the ACPI 
> > > evaluation of
> > > the WAK method, the PCI_COMMAND  INTX_DISABLE (0x400) flag is being 
> > > cleared.
> > > My guess is that the BIOS ends up doing some resetting of devices.
> > > 
> > > I may be able to workaround the problem for this one device, but it 
> > > brings up
> > > a more general issue about what the ordering should be during resume. If 
> > > ACPI
> > > evaluation (which I assume talks to the BIOS), might change device state, 
> > > it
> > > seems that ACPI code should execute before resuming devices not after. 
> > > But changing
> > > the order here seems drastic.
> > > 
> > > An alternate solution would be to have two pm_ops, one for early_resume
> > > and another for late, and split the ACPI work.
> > > 
> > > --- 2.6.19-rc5.orig/kernel/power/main.c   2006-11-14 14:24:37.0 
> > > -0800
> > > +++ 2.6.19-rc5/kernel/power/main.c2006-11-14 14:25:23.0 
> > > -0800
> > > @@ -132,12 +132,12 @@
> > >  
> > >  static void suspend_finish(suspend_state_t state)
> > >  {
> > > + if (pm_ops && pm_ops->finish)
> > > + pm_ops->finish(state);
> > >   device_resume();
> > >   resume_console();
> > >   thaw_processes();
> > >   enable_nonboot_cpus();
> > > - if (pm_ops && pm_ops->finish)
> > > - pm_ops->finish(state);
> > >   pm_restore_console();
> > >  }
> > 
> > Yes, I agree that _WAK needs to come before device_resume().
> > Need to let any BIOS nasties happen and get over with before we restore 
> > device drivers.
> > This is consistent with the wording in ACPI 3.0b (section 7.4) that says
> > 11. _WAK is run
> > 12. OSPM notifies all native device drivefrs of the return from the sleep 
> > state transition
> > 
> > However, commit 1a38416cea8ac801ae8f261074721f35317613dc says that
> > _WAK must follow INIT -- ie finish() must come after enable_nonboot_cpus(),
> > and this patch as it stands would violate that.
> > 
> > So it looks like we need this sequence:
> > 
> > enable_nonboot_cpus() /* INIT */
> > finish()/* _WAK */
> > device_resume()
> > 
> 
> Do you want to do this, or shall I? send off a patch.
> I can test on about 5 machines first.

Could we please wait with that a bit until we have the freezer fixed?

Rafael


-- 
You never change things by fighting the existing reality.
R. Buckminster Fuller

-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Rafael J. Wysocki
Hi,

On Friday, 1 December 2006 10:33, Pavel Machek wrote:
> Hi!
> 
> > > So it looks like we need this sequence:
> > > 
> > > enable_nonboot_cpus() /* INIT */
> > > finish()  /* _WAK */
> > > device_resume()
> > 
> > Can somebody remind me about this immediately after 2.6.19?
> 
> Remind. But note that freezer is not yet SMP safe... Rafael is working
> on that.

Yup.

BTW, have you looked at the last version of the patch for the handling of
stopped tasks (appended just in case, full discussion at:
http://lists.osdl.org/pipermail/linux-pm/2006-November/004214.html)?

Rafael


Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 kernel/power/process.c |   36 ++--
 kernel/signal.c|2 +-
 2 files changed, 31 insertions(+), 7 deletions(-)

Index: linux-2.6.19-rc6-mm2/kernel/power/process.c
===
--- linux-2.6.19-rc6-mm2.orig/kernel/power/process.c
+++ linux-2.6.19-rc6-mm2/kernel/power/process.c
@@ -28,8 +28,7 @@ static inline int freezeable(struct task
if ((p == current) || 
(p->flags & PF_NOFREEZE) ||
(p->exit_state == EXIT_ZOMBIE) ||
-   (p->exit_state == EXIT_DEAD) ||
-   (p->state == TASK_STOPPED))
+   (p->exit_state == EXIT_DEAD))
return 0;
return 1;
 }
@@ -81,6 +80,11 @@ static void cancel_freezing(struct task_
}
 }
 
+static inline int stopped_and_freezing(struct task_struct *p)
+{
+   return p->state == TASK_STOPPED && freezing(p);
+}
+
 static inline int is_user_space(struct task_struct *p)
 {
return p->mm && !(p->flags & PF_BORROWED_MM);
@@ -103,9 +107,11 @@ static unsigned int try_to_freeze_tasks(
if (frozen(p))
continue;
 
-   if (p->state == TASK_TRACED &&
-   (frozen(p->parent) ||
-p->parent->state == TASK_STOPPED)) {
+   if (stopped_and_freezing(p))
+   continue;
+
+   if (p->state == TASK_TRACED && (frozen(p->parent) ||
+   stopped_and_freezing(p->parent))) {
cancel_freezing(p);
continue;
}
@@ -149,7 +155,8 @@ static unsigned int try_to_freeze_tasks(
if (is_user_space(p) == !freeze_user_space)
continue;
 
-   if (freezeable(p) && !frozen(p))
+   if (freezeable(p) && !frozen(p) &&
+   p->state != TASK_STOPPED && p->state != TASK_TRACED)
printk(KERN_ERR " %s\n", p->comm);
 
cancel_freezing(p);
@@ -185,6 +192,18 @@ int freeze_processes(void)
return 0;
 }
 
+static void release_stopped_tasks(void)
+{
+   struct task_struct *g, *p;
+
+   read_lock(&tasklist_lock);
+   do_each_thread(g, p) {
+   if (stopped_and_freezing(p))
+   cancel_freezing(p);
+   } while_each_thread(g, p);
+   read_unlock(&tasklist_lock);
+}
+
 static void thaw_tasks(int thaw_user_space)
 {
struct task_struct *g, *p;
@@ -197,6 +216,10 @@ static void thaw_tasks(int thaw_user_spa
if (is_user_space(p) == !thaw_user_space)
continue;
 
+   if (!frozen(p) &&
+   (p->state == TASK_STOPPED || p->state == TASK_TRACED))
+   continue;
+
if (!thaw_process(p))
printk(KERN_WARNING " Strange, %s not stopped\n",
p->comm );
@@ -207,6 +230,7 @@ static void thaw_tasks(int thaw_user_spa
 void thaw_processes(void)
 {
printk("Restarting tasks ... ");
+   release_stopped_tasks();
thaw_tasks(FREEZER_KERNEL_THREADS);
thaw_tasks(FREEZER_USER_SPACE);
schedule();
Index: linux-2.6.19-rc6-mm2/kernel/signal.c
===
--- linux-2.6.19-rc6-mm2.orig/kernel/signal.c
+++ linux-2.6.19-rc6-mm2/kernel/signal.c
@@ -1937,9 +1937,9 @@ int get_signal_to_deliver(siginfo_t *inf
sigset_t *mask = ¤t->blocked;
int signr = 0;
 
+relock:
try_to_freeze();
 
-relock:
spin_lock_irq(¤t->sighand->siglock);
for (;;) {
struct k_sigaction *ka;

-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Pavel Machek
Hi!

> > > > So it looks like we need this sequence:
> > > > 
> > > > enable_nonboot_cpus() /* INIT */
> > > > finish()/* _WAK */
> > > > device_resume()
> > > 
> > > Can somebody remind me about this immediately after 2.6.19?
> > 
> > Remind. But note that freezer is not yet SMP safe... Rafael is working
> > on that.
> 
> Yup.
> 
> BTW, have you looked at the last version of the patch for the handling of
> stopped tasks (appended just in case, full discussion at:
> http://lists.osdl.org/pipermail/linux-pm/2006-November/004214.html)?

Well, I took a look.. and decided I'd like to find the place in kernel
where I can add try_to_freeze() and fix the TASK_STOPPED processes. I
hope such place exists.
Pavel 

> 
> Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> ---
>  kernel/power/process.c |   36 ++--
>  kernel/signal.c|2 +-
>  2 files changed, 31 insertions(+), 7 deletions(-)
> 
> Index: linux-2.6.19-rc6-mm2/kernel/power/process.c
> ===
> --- linux-2.6.19-rc6-mm2.orig/kernel/power/process.c
> +++ linux-2.6.19-rc6-mm2/kernel/power/process.c
> @@ -28,8 +28,7 @@ static inline int freezeable(struct task
>   if ((p == current) || 
>   (p->flags & PF_NOFREEZE) ||
>   (p->exit_state == EXIT_ZOMBIE) ||
> - (p->exit_state == EXIT_DEAD) ||
> - (p->state == TASK_STOPPED))
> + (p->exit_state == EXIT_DEAD))
>   return 0;
>   return 1;
>  }
> @@ -81,6 +80,11 @@ static void cancel_freezing(struct task_
>   }
>  }
>  
> +static inline int stopped_and_freezing(struct task_struct *p)
> +{
> + return p->state == TASK_STOPPED && freezing(p);
> +}
> +
>  static inline int is_user_space(struct task_struct *p)
>  {
>   return p->mm && !(p->flags & PF_BORROWED_MM);
> @@ -103,9 +107,11 @@ static unsigned int try_to_freeze_tasks(
>   if (frozen(p))
>   continue;
>  
> - if (p->state == TASK_TRACED &&
> - (frozen(p->parent) ||
> -  p->parent->state == TASK_STOPPED)) {
> + if (stopped_and_freezing(p))
> + continue;
> +
> + if (p->state == TASK_TRACED && (frozen(p->parent) ||
> + stopped_and_freezing(p->parent))) {
>   cancel_freezing(p);
>   continue;
>   }
> @@ -149,7 +155,8 @@ static unsigned int try_to_freeze_tasks(
>   if (is_user_space(p) == !freeze_user_space)
>   continue;
>  
> - if (freezeable(p) && !frozen(p))
> + if (freezeable(p) && !frozen(p) &&
> + p->state != TASK_STOPPED && p->state != TASK_TRACED)
>   printk(KERN_ERR " %s\n", p->comm);
>  
>   cancel_freezing(p);
> @@ -185,6 +192,18 @@ int freeze_processes(void)
>   return 0;
>  }
>  
> +static void release_stopped_tasks(void)
> +{
> + struct task_struct *g, *p;
> +
> + read_lock(&tasklist_lock);
> + do_each_thread(g, p) {
> + if (stopped_and_freezing(p))
> + cancel_freezing(p);
> + } while_each_thread(g, p);
> + read_unlock(&tasklist_lock);
> +}
> +
>  static void thaw_tasks(int thaw_user_space)
>  {
>   struct task_struct *g, *p;
> @@ -197,6 +216,10 @@ static void thaw_tasks(int thaw_user_spa
>   if (is_user_space(p) == !thaw_user_space)
>   continue;
>  
> + if (!frozen(p) &&
> + (p->state == TASK_STOPPED || p->state == TASK_TRACED))
> + continue;
> +
>   if (!thaw_process(p))
>   printk(KERN_WARNING " Strange, %s not stopped\n",
>   p->comm );
> @@ -207,6 +230,7 @@ static void thaw_tasks(int thaw_user_spa
>  void thaw_processes(void)
>  {
>   printk("Restarting tasks ... ");
> + release_stopped_tasks();
>   thaw_tasks(FREEZER_KERNEL_THREADS);
>   thaw_tasks(FREEZER_USER_SPACE);
>   schedule();
> Index: linux-2.6.19-rc6-mm2/kernel/signal.c
> ===
> --- linux-2.6.19-rc6-mm2.orig/kernel/signal.c
> +++ linux-2.6.19-rc6-mm2/kernel/signal.c
> @@ -1937,9 +1937,9 @@ int get_signal_to_deliver(siginfo_t *inf
>   sigset_t *mask = ¤t->blocked;
>   int signr = 0;
>  
> +relock:
>   try_to_freeze();
>  
> -relock:
>   spin_lock_irq(¤t->sighand->siglock);
>   for (;;) {
>   struct k_sigaction *ka;

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "un

Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Rafael J. Wysocki
Hi,

On Friday, 1 December 2006 11:57, Pavel Machek wrote:
> Hi!
> 
> > > > > So it looks like we need this sequence:
> > > > > 
> > > > > enable_nonboot_cpus() /* INIT */
> > > > > finish()  /* _WAK */
> > > > > device_resume()
> > > > 
> > > > Can somebody remind me about this immediately after 2.6.19?
> > > 
> > > Remind. But note that freezer is not yet SMP safe... Rafael is working
> > > on that.
> > 
> > Yup.
> > 
> > BTW, have you looked at the last version of the patch for the handling of
> > stopped tasks (appended just in case, full discussion at:
> > http://lists.osdl.org/pipermail/linux-pm/2006-November/004214.html)?
> 
> Well, I took a look.. and decided I'd like to find the place in kernel
> where I can add try_to_freeze() and fix the TASK_STOPPED processes. I
> hope such place exists.

Well, try_to_freeze() will only work for a process that has PF_FREEZE set, no?

So, if you want try_to_freeze() to work for a stopped process, you have to set
PF_FREEZE for this process, but this means that stopped processes cannot
be treated as unfreezeable ...

> > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
> > ---
> >  kernel/power/process.c |   36 ++--
> >  kernel/signal.c|2 +-
> >  2 files changed, 31 insertions(+), 7 deletions(-)
> > 
> > Index: linux-2.6.19-rc6-mm2/kernel/power/process.c
> > ===
> > --- linux-2.6.19-rc6-mm2.orig/kernel/power/process.c
> > +++ linux-2.6.19-rc6-mm2/kernel/power/process.c
> > @@ -28,8 +28,7 @@ static inline int freezeable(struct task
> > if ((p == current) || 
> > (p->flags & PF_NOFREEZE) ||
> > (p->exit_state == EXIT_ZOMBIE) ||
> > -   (p->exit_state == EXIT_DEAD) ||
> > -   (p->state == TASK_STOPPED))
> > +   (p->exit_state == EXIT_DEAD))

... so this change is _necessary_ anyway.

Now if we don't treat stopped processes as unfreezeable, it doesn't make
sense to set PF_FREEZE for them more than once.  This it's reasonable to
check if we have already set PF_FREEZE for given stopped task and skip it
if so ...

> > return 0;
> > return 1;
> >  }
> > @@ -81,6 +80,11 @@ static void cancel_freezing(struct task_
> > }
> >  }
> >  
> > +static inline int stopped_and_freezing(struct task_struct *p)
> > +{
> > +   return p->state == TASK_STOPPED && freezing(p);
> > +}
> > +
> >  static inline int is_user_space(struct task_struct *p)
> >  {
> > return p->mm && !(p->flags & PF_BORROWED_MM);
> > @@ -103,9 +107,11 @@ static unsigned int try_to_freeze_tasks(
> > if (frozen(p))
> > continue;
> >  
> > -   if (p->state == TASK_TRACED &&
> > -   (frozen(p->parent) ||
> > -p->parent->state == TASK_STOPPED)) {
> > +   if (stopped_and_freezing(p))
> > +   continue;

... which is done here.

Now the condition for the freezing of traced tasks should be modified to
account for the fact that we no longer regard stopped tasks as unfreezeable ...

> > +
> > +   if (p->state == TASK_TRACED && (frozen(p->parent) ||
> > +   stopped_and_freezing(p->parent))) {

... so we need this change too.

> > cancel_freezing(p);
> > continue;
> > }
> > @@ -149,7 +155,8 @@ static unsigned int try_to_freeze_tasks(
> > if (is_user_space(p) == !freeze_user_space)
> > continue;
> >  
> > -   if (freezeable(p) && !frozen(p))
> > +   if (freezeable(p) && !frozen(p) &&
> > +   p->state != TASK_STOPPED && p->state != TASK_TRACED)
> > printk(KERN_ERR " %s\n", p->comm);

Now if the freezing of tasks fails, we don't want to confuse the user by
reporting that some stopped or traced tasks weren't frozen, so it's reasonable
to do the above, but it's really optional.

> >  
> > cancel_freezing(p);
> > @@ -185,6 +192,18 @@ int freeze_processes(void)
> > return 0;
> >  }
> >  
> > +static void release_stopped_tasks(void)
> > +{
> > +   struct task_struct *g, *p;
> > +
> > +   read_lock(&tasklist_lock);
> > +   do_each_thread(g, p) {
> > +   if (stopped_and_freezing(p))
> > +   cancel_freezing(p);
> > +   } while_each_thread(g, p);
> > +   read_unlock(&tasklist_lock);
> > +}
> > +
> >  static void thaw_tasks(int thaw_user_space)
> >  {
> > struct task_struct *g, *p;
> > @@ -197,6 +216,10 @@ static void thaw_tasks(int thaw_user_spa
> > if (is_user_space(p) == !thaw_user_space)
> > continue;
> >  
> > +   if (!frozen(p) &&
> > +   (p->state == TASK_STOPPED || p->state == TASK_TRACED))
> > +   continue;
> > +

Again, we shouldn't be confusing the user by reporti

Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Linus Torvalds


On Fri, 1 Dec 2006, Pavel Machek wrote:

> > > So it looks like we need this sequence:
> > > 
> > > enable_nonboot_cpus() /* INIT */
> > > finish()  /* _WAK */
> > > device_resume()
> > 
> > Can somebody remind me about this immediately after 2.6.19?
> 
> Remind. But note that freezer is not yet SMP safe... Rafael is working
> on that.

Thanks.

On the other hand, I really wonder (and suspect) whether the problem isn't 
really the freezer or even the kernel resume ordering, but simply an ACPI 
internal resume ordering thing.

Doesn't ACPI have per-device "WAK" calls anyway? Shouldn't we just call 
those _individually_ as we walk the device tree (perhaps in the 
"early_resume" stage) rather than calling them all in one chunk?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Alexey Starikovskiy

Linus Torvalds wrote:

On Fri, 1 Dec 2006, Pavel Machek wrote:

  

So it looks like we need this sequence:

enable_nonboot_cpus() /* INIT */
finish()/* _WAK */
device_resume()


Can somebody remind me about this immediately after 2.6.19?
  

Remind. But note that freezer is not yet SMP safe... Rafael is working
on that.



Thanks.

On the other hand, I really wonder (and suspect) whether the problem isn't 
really the freezer or even the kernel resume ordering, but simply an ACPI 
internal resume ordering thing.


Doesn't ACPI have per-device "WAK" calls anyway? Shouldn't we just call 
those _individually_ as we walk the device tree (perhaps in the 
"early_resume" stage) rather than calling them all in one chunk?


Linus
  
_WAK method is system-wide. Individual objects do not have their own 
resume methods.
One way of reordering internal ACPI resume is done in patch series to 
7122, I mentioned that earlier.

It's possible to resume ACPI devices after execution of _WAK in pm->finish.

Alex.
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Stephen Hemminger
On Fri, 01 Dec 2006 20:45:51 +0300
Alexey Starikovskiy <[EMAIL PROTECTED]> wrote:

> Linus Torvalds wrote:
> > On Fri, 1 Dec 2006, Pavel Machek wrote:
> >
> >   
>  So it looks like we need this sequence:
> 
>  enable_nonboot_cpus() /* INIT */
>  finish() /* _WAK */
>  device_resume()
>  
> >>> Can somebody remind me about this immediately after 2.6.19?
> >>>   
> >> Remind. But note that freezer is not yet SMP safe... Rafael is working
> >> on that.
> >> 
> >
> > Thanks.
> >
> > On the other hand, I really wonder (and suspect) whether the problem isn't 
> > really the freezer or even the kernel resume ordering, but simply an ACPI 
> > internal resume ordering thing.
> >
> > Doesn't ACPI have per-device "WAK" calls anyway? Shouldn't we just call 
> > those _individually_ as we walk the device tree (perhaps in the 
> > "early_resume" stage) rather than calling them all in one chunk?
> >
> > Linus
> >   
> _WAK method is system-wide. Individual objects do not have their own 
> resume methods.
> One way of reordering internal ACPI resume is done in patch series to 
> 7122, I mentioned that earlier.
> It's possible to resume ACPI devices after execution of _WAK in pm->finish.

Does it solve the original problem where finish() was getting run after
device_resume(), and finish() was corrupting PCI register settings like
MSI?


-- 
Stephen Hemminger <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-pm] [RFC] ACPI vs device ordering on resume

2006-12-01 Thread Alexey Starikovskiy

Stephen Hemminger wrote:

On Fri, 01 Dec 2006 20:45:51 +0300
Alexey Starikovskiy <[EMAIL PROTECTED]> wrote:

  

Linus Torvalds wrote:


On Fri, 1 Dec 2006, Pavel Machek wrote:

  
  

So it looks like we need this sequence:

enable_nonboot_cpus() /* INIT */
finish()/* _WAK */
device_resume()



Can somebody remind me about this immediately after 2.6.19?
  
  

Remind. But note that freezer is not yet SMP safe... Rafael is working
on that.



Thanks.

On the other hand, I really wonder (and suspect) whether the problem isn't 
really the freezer or even the kernel resume ordering, but simply an ACPI 
internal resume ordering thing.


Doesn't ACPI have per-device "WAK" calls anyway? Shouldn't we just call 
those _individually_ as we walk the device tree (perhaps in the 
"early_resume" stage) rather than calling them all in one chunk?


Linus
  
  
_WAK method is system-wide. Individual objects do not have their own 
resume methods.
One way of reordering internal ACPI resume is done in patch series to 
7122, I mentioned that earlier.

It's possible to resume ACPI devices after execution of _WAK in pm->finish.



Does it solve the original problem where finish() was getting run after
device_resume(), and finish() was corrupting PCI register settings like
MSI?

  

No, only the case with ACPI devices... all other devices are untouched.
-
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html