Re: freeze vs freezer

2008-01-05 Thread Nigel Cunningham
Hi.

Pavel Machek wrote:
> On Fri 2008-01-04 21:54:06, Oliver Neukum wrote:
>> On Thursday, 3 January 2008 23:06:07, Nigel Cunningham wrote:
>>> Oliver Neukum wrote:
>>>> On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
>>>>> Oliver Neukum wrote:
>>>>>> On Thursday, 3 January 2008, Nigel Cunningham wrote:
>>>>>>> On top of this, I made a (too simple at the moment) freeze_filesystems
>>>>>>> function which iterates through &super_blocks in reverse order, freezing
>>>>>>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
>>>>>>> currently allow for the possibility of someone mounting (say) ext3 on
>>>>>>> fuse, but that would just be an extension of what's already done.
>>>>>> How do you deal with fuse server tasks using other fuse filesystems?
>>>>> Since they're frozen in reverse order, the dependent one would be frozen
>>>>> first.
>>>> Say I do:
>>>>
>>>> a) mount fuse on /tmp/first
>>>> b) mount fuse on /tmp/second
>>>>
>>>> Then the server task for (a) does "ls /tmp/second". So it will be frozen,
>>>> right? How do you then freeze (a)? And keep in mind that the server task
>>>> may have forked.
>>> I guess I should first ask, is this a real life problem or a
>>> hypothetical twisted web? I don't see why you would want to make two
>>> filesystems interdependent - it sounds like the way to create livelock
>>> and deadlocks in normal use, before we even begin to think about
>>> hibernating.
>> Good questions. I personally don't use fuse, but I do care about power
>> management. The problem I see is that an unprivileged user could make
>> that dependency, even inadvertently.
> 
> Other problem is that unprivileged user can do it with evil intent. So
> called "denial-of-service" attack.

Only in this case it would be a denial-of-denial-of-service attack,
since it would stop you hibernating or suspending :).

This is still all hypothetical. If I could have a real life case where
this could actually happen, it would help a lot.

Nigel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: freeze vs freezer

2008-01-05 Thread Pavel Machek
On Fri 2008-01-04 21:54:06, Oliver Neukum wrote:
> On Thursday, 3 January 2008 23:06:07, Nigel Cunningham wrote:
> > Hi.
> > 
> > Oliver Neukum wrote:
> > > On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
> > >> Hi.
> > >>
> > >> Oliver Neukum wrote:
> > >>> On Thursday, 3 January 2008, Nigel Cunningham wrote:
> > >>>> On top of this, I made a (too simple at the moment) freeze_filesystems
> > >>>> function which iterates through &super_blocks in reverse order, freezing
> > >>>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
> > >>>> currently allow for the possibility of someone mounting (say) ext3 on
> > >>>> fuse, but that would just be an extension of what's already done.
> > >>> How do you deal with fuse server tasks using other fuse filesystems?
> > >> Since they're frozen in reverse order, the dependent one would be frozen
> > >> first.
> > > 
> > > Say I do:
> > > 
> > > a) mount fuse on /tmp/first
> > > b) mount fuse on /tmp/second
> > > 
> > > Then the server task for (a) does "ls /tmp/second". So it will be frozen,
> > > right? How do you then freeze (a)? And keep in mind that the server task
> > > may have forked.
> > 
> > I guess I should first ask, is this a real life problem or a
> > hypothetical twisted web? I don't see why you would want to make two
> > filesystems interdependent - it sounds like the way to create livelock
> > and deadlocks in normal use, before we even begin to think about
> > hibernating.
> 
> Good questions. I personally don't use fuse, but I do care about power
> management. The problem I see is that an unprivileged user could make
> that dependency, even inadvertently.

Other problem is that unprivileged user can do it with evil intent. So
called "denial-of-service" attack.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: freeze vs freezer

2008-01-04 Thread Kyle Moffett

On Jan 04, 2008, at 15:54:06, Oliver Neukum wrote:
> On Thursday, 3 January 2008 23:06:07, Nigel Cunningham wrote:
>> Hi.
>>
>>> a) mount fuse on /tmp/first
>>> b) mount fuse on /tmp/second
>>>
>>> Then the server task for (a) does "ls /tmp/second". So it will be
>>> frozen, right? How do you then freeze (a)? And keep in mind that
>>> the server task may have forked.
>>
>> I guess I should first ask, is this a real life problem or a
>> hypothetical twisted web? I don't see why you would want to make
>> two filesystems interdependent - it sounds like the way to create
>> livelock and deadlocks in normal use, before we even begin to
>> think about hibernating.
>
> Good questions. I personally don't use fuse, but I do care about
> power management. The problem I see is that an unprivileged user
> could make that dependency, even inadvertently.


I don't think it makes sense for the kernel to try to keep track of  
hard data dependencies for FUSE filesystems, or to even *attempt* to  
auto-suspend them.  You should instead allow a privileged program to  
initiate a "freeze-and-flush" operation on a particular FUSE  
filesystem and optionally wait for it to finish.  Then your userspace  
would be configured with the appropriate data dependencies and would  
stop FUSE filesystems in the appropriate order.
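
As a rough illustration of the userspace-side ordering described above, the sketch below computes a freeze order from a table of daemon dependencies. It is not taken from any posted patch: struct fs_graph, freeze_order() and MAX_FS are hypothetical names, and the dependency table is assumed to come from userspace configuration rather than from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FS 16  /* hypothetical upper bound, for the sketch only */

/*
 * Hypothetical dependency table supplied by userspace configuration.
 * needs[i][j] != 0 means the daemon serving filesystem i accesses
 * filesystem j, so i has to be frozen before j.
 */
struct fs_graph {
	size_t count;
	unsigned char needs[MAX_FS][MAX_FS];
};

/*
 * Fill order[] so that every filesystem appears before the filesystems
 * its daemon depends on. Returns 0 on success, or -1 when the
 * dependencies form a cycle and no safe order exists.
 */
int freeze_order(const struct fs_graph *g, size_t order[MAX_FS])
{
	size_t users[MAX_FS] = {0};       /* not-yet-frozen users of each fs */
	unsigned char done[MAX_FS] = {0};
	size_t placed = 0;
	size_t i, j;

	assert(g->count <= MAX_FS);

	for (i = 0; i < g->count; i++)
		for (j = 0; j < g->count; j++)
			if (g->needs[i][j])
				users[j]++;

	while (placed < g->count) {
		size_t pick = g->count;

		/* A filesystem is safe to freeze once nothing unfrozen uses it. */
		for (i = 0; i < g->count; i++) {
			if (!done[i] && users[i] == 0) {
				pick = i;
				break;
			}
		}
		if (pick == g->count)
			return -1;	/* every remaining fs still has a user: cycle */

		done[pick] = 1;
		order[placed++] = pick;
		for (j = 0; j < g->count; j++)
			if (g->needs[pick][j])
				users[j]--;
	}
	return 0;
}
```

A cycle (two daemons that each read the other's filesystem) makes the function return -1, which is the case where userspace would have to give up or break the dependency by other means.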


In addition, the kernel would automatically understand
ext3=>loopback=>fuse, and when asked to freeze the "fuse" part, it
would first freeze the "ext3" and the "loopback" parts using similar
mechanisms as device-mapper currently uses when you do "dmsetup
suspend mydev" followed by "echo 0 $SIZE snapshot /dev/mapper/mydev-base
/dev/mapper/mydev-snap-back p 8 | dmsetup load mydev"  (IE: when
you create a snapshot of a given device).


Naturally userspace could deadlock itself (although not the kernel)  
by freezing a block device and then attempting to access it, but  
since the "freeze" operation is limited to root this is not a big  
issue.  The way to freeze all filesystems safely would be to clone a  
new mount namespace, mlockall(), mount a tmpfs, pivot_root() into the  
tmpfs, bind-mount the filesystems you want to freeze directly onto  
subdirectories of the tmpfs, and then freeze them in an appropriate  
order.


Besides which the worst-case is a pretty straightforward non-critical  
failure; you might fail to fully sync a FUSE filesystem because its  
daemon is asleep waiting on something (possibly even just sitting in  
a "sleep(1)" call with all signals masked).  You simply need to  
make sure that all tasks are asleep outside of driver critical  
sections so that you can properly suspend your device tree.


Cheers,
Kyle Moffett


Re: freeze vs freezer

2008-01-04 Thread Oliver Neukum
On Thursday, 3 January 2008 23:06:07, Nigel Cunningham wrote:
> Hi.
> 
> Oliver Neukum wrote:
> > On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
> >> Hi.
> >>
> >> Oliver Neukum wrote:
> >>> On Thursday, 3 January 2008, Nigel Cunningham wrote:
> >>>> On top of this, I made a (too simple at the moment) freeze_filesystems
> >>>> function which iterates through &super_blocks in reverse order, freezing
> >>>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
> >>>> currently allow for the possibility of someone mounting (say) ext3 on
> >>>> fuse, but that would just be an extension of what's already done.
> >>> How do you deal with fuse server tasks using other fuse filesystems?
> >> Since they're frozen in reverse order, the dependent one would be frozen
> >> first.
> > 
> > Say I do:
> > 
> > a) mount fuse on /tmp/first
> > b) mount fuse on /tmp/second
> > 
> > Then the server task for (a) does "ls /tmp/second". So it will be frozen,
> > right? How do you then freeze (a)? And keep in mind that the server task
> > may have forked.
> 
> I guess I should first ask, is this a real life problem or a
> hypothetical twisted web? I don't see why you would want to make two
> filesystems interdependent - it sounds like the way to create livelock
> and deadlocks in normal use, before we even begin to think about
> hibernating.

Good questions. I personally don't use fuse, but I do care about power
management. The problem I see is that an unprivileged user could make
that dependency, even inadvertently.

Regards
Oliver



Re: freeze vs freezer

2008-01-03 Thread Rafael J. Wysocki
On Thursday, 3 of January 2008, Nigel Cunningham wrote:
> Hi.
> 
> Rafael J. Wysocki wrote:
> > On Wednesday, 2 of January 2008, Nigel Cunningham wrote:
> >> Pavel Machek wrote:
> >>>>>>>> So how do you handle threads that are blocked on I/O or a lock
> >>>>>>>> during the system freeze process, then?
> >>>>>>> We wait until they can continue.
> >>>>>> So if I have a process blocked on an unavailable NFS mount, I can't
> >>>>>> suspend?
> >>>>> That's correct, you can't.
> >>>>>
> >>>>> [And I know what you're going to say. ;-)]
> >>>> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"
> >>>> instead of a zero preempt_count()?  Really what we should do is just
> >>>> iterate over all of the actual physical devices and tell each one
> >>>> "Block new IO requests preemptably, finish pending DMA, put the
> >>>> hardware in low-power mode, and prepare for suspend/hibernate".  As
> >>>> long as each driver knows how to do those simple things we can have
> >>>> an entirely consistent kernel image for both suspend and for
> >>>> hibernation.
> >>> "each driver" means this is a lot of work. But yes, that is probably
> >>> way to go, and patch would be welcome.
> >> Yes, that does work. It's what I've done in my (preliminary) support for
> >> fuse.
> > 
> > Hmm, can you please elaborate a bit?
> 
> Sorry. I wasn't very unambiguous, was I? And I'm not sure now whether
> you're meaning "How does fuse support relate to freezing block devices?"
> or "What's this about fuse support?". Let me therefore seek to answer
> both questions:
> 
> Higher level, I know (filesystems rather than block devices), but I was
> meaning the general concept of blocking new requests and completing
> existing ones worked fine for the supposedly impossible fuse support.
> 
> Re fuse support, let me start by saying "I know this doesn't handle all
> situations, but I think it's a good enough proof-of-concept implementation".
> 
> I added some simple hooks to the code for submitting new work to fuse
> threads.
> 
> #define FUSE_MIGHT_FREEZE(superblock, desc) \
> do { \
> 	int printed = 0; \
> 	while (superblock->s_frozen != SB_UNFROZEN) { \
> 		if (!printed) { \
> 			printk("%d frozen in " desc ".\n", current->pid); \
> 			printed = 1; \
> 		} \
> 		try_to_freeze(); \
> 		yield(); \
> 	} \
> } while (0)
> 
> On top of this, I made a (too simple at the moment) freeze_filesystems
> function which iterates through &super_blocks in reverse order, freezing
> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
> currently allow for the possibility of someone mounting (say) ext3 on
> fuse, but that would just be an extension of what's already done.
> 
> The end result is:
> 
> int freeze_processes(void)
> {
> 	int error;
> 
> 	printk(KERN_INFO "Stopping fuse filesystems.\n");
> 	freeze_filesystems(FS_FREEZER_FUSE);
> 	freezer_state = FREEZER_FILESYSTEMS_FROZEN;
> 	printk(KERN_INFO "Freezing user space processes ... ");
> 	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
> 	if (error)
> 		goto Exit;
> 	printk(KERN_INFO "done.\n");
> 
> 	sys_sync();
> 	printk(KERN_INFO "Stopping normal filesystems.\n");
> 	freeze_filesystems(FS_FREEZER_NORMAL);
> 	freezer_state = FREEZER_USERSPACE_FROZEN;
> 	printk(KERN_INFO "Freezing remaining freezable tasks ... ");
> 	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
> 	if (error)
> 		goto Exit;
> 	printk(KERN_INFO "done.");
> 	freezer_state = FREEZER_FULLY_ON;
>  Exit:
> 	BUG_ON(in_atomic());
> 	printk("\n");
> 	return error;
> }
> 
> Sorry if that's more info than you wanted.

No, that's fine, thanks.

Greetings,
Rafael


Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Oliver Neukum wrote:
> On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
>> Hi.
>>
>> Oliver Neukum wrote:
>>> On Thursday, 3 January 2008, Nigel Cunningham wrote:
>>>> On top of this, I made a (too simple at the moment) freeze_filesystems
>>>> function which iterates through &super_blocks in reverse order, freezing
>>>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
>>>> currently allow for the possibility of someone mounting (say) ext3 on
>>>> fuse, but that would just be an extension of what's already done.
>>> How do you deal with fuse server tasks using other fuse filesystems?
>> Since they're frozen in reverse order, the dependent one would be frozen
>> first.
> 
> Say I do:
> 
> a) mount fuse on /tmp/first
> b) mount fuse on /tmp/second
> 
> Then the server task for (a) does "ls /tmp/second". So it will be frozen,
> right? How do you then freeze (a)? And keep in mind that the server task
> may have forked.

I guess I should first ask, is this a real life problem or a
hypothetical twisted web? I don't see why you would want to make two
filesystems interdependent - it sounds like the way to create livelock
and deadlocks in normal use, before we even begin to think about
hibernating.

Regards,

Nigel


Re: freeze vs freezer

2008-01-03 Thread Oliver Neukum
On Thursday, 3 January 2008 10:52:53, Nigel Cunningham wrote:
> Hi.
> 
> Oliver Neukum wrote:
> > On Thursday, 3 January 2008, Nigel Cunningham wrote:
> >> On top of this, I made a (too simple at the moment) freeze_filesystems
> >> function which iterates through &super_blocks in reverse order, freezing
> >> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
> >> currently allow for the possibility of someone mounting (say) ext3 on
> >> fuse, but that would just be an extension of what's already done.
> > 
> > How do you deal with fuse server tasks using other fuse filesystems?
> 
> Since they're frozen in reverse order, the dependent one would be frozen
> first.

Say I do:

a) mount fuse on /tmp/first
b) mount fuse on /tmp/second

Then the server task for (a) does "ls /tmp/second". So it will be frozen,
right? How do you then freeze (a)? And keep in mind that the server task
may have forked.
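
To make the objection concrete, here is a small userspace model of it. The assumptions are loud ones: mounts are identified by their position in mount order, a dependency means "the server for `user` accesses `provider`", and struct fuse_dep and reverse_order_is_safe() are hypothetical names that exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Indices are positions in mount order: 0 was mounted first. */
struct fuse_dep {
	size_t user;      /* mount whose server task issues the access */
	size_t provider;  /* mount being accessed by that server task */
};

/*
 * Reverse mount order freezes higher indices first. That only works if
 * every provider was mounted before its user, so that the user is
 * frozen while its provider can still answer requests.
 */
bool reverse_order_is_safe(const struct fuse_dep *deps, size_t ndeps)
{
	size_t i;

	assert(deps != NULL || ndeps == 0);

	for (i = 0; i < ndeps; i++) {
		/* Provider frozen first (or self-dependency): server can block. */
		if (deps[i].provider >= deps[i].user)
			return false;
	}
	return true;
}
```

With /tmp/first as mount 0 and /tmp/second as mount 1, the case above is { .user = 0, .provider = 1 }, which the check rejects: /tmp/second is frozen first, and the server for /tmp/first can then block on it.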

Regards
Oliver


Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Oliver Neukum wrote:
> On Thursday, 3 January 2008, Nigel Cunningham wrote:
>> On top of this, I made a (too simple at the moment) freeze_filesystems
>> function which iterates through &super_blocks in reverse order, freezing
>> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
>> currently allow for the possibility of someone mounting (say) ext3 on
>> fuse, but that would just be an extension of what's already done.
> 
> How do you deal with fuse server tasks using other fuse filesystems?

Since they're frozen in reverse order, the dependent one would be frozen
first.

> How does freeze_filesystems() look?

Removing my ugly debugging statements, it's currently:

/**
 * freeze_filesystems - lock all filesystems and force them into a
 * consistent state
 */
void freeze_filesystems(int which)
{
	struct super_block *sb;

	lockdep_off();

	/*
	 * Freeze in reverse order so filesystems dependent upon others are
	 * frozen in the right order (eg. loopback on ext3).
	 */
	list_for_each_entry_reverse(sb, &super_blocks, s_list) {
		if (sb->s_type->fs_flags & FS_IS_FUSE &&
		    sb->s_frozen == SB_UNFROZEN &&
		    which & FS_FREEZER_FUSE) {
			sb->s_frozen = SB_FREEZE_TRANS;
			sb->s_flags |= MS_FROZEN;
			continue;
		}

		if (!sb->s_root || !sb->s_bdev ||
		    (sb->s_frozen == SB_FREEZE_TRANS) ||
		    (sb->s_flags & MS_RDONLY) ||
		    (sb->s_flags & MS_FROZEN) ||
		    !(which & FS_FREEZER_NORMAL))
			continue;
		freeze_bdev(sb->s_bdev);
		sb->s_flags |= MS_FROZEN;
	}

	lockdep_on();
}
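
For readers without a kernel tree at hand, the two-pass behaviour can be modelled in plain userspace C. This is only a sketch of the selection logic: struct mock_sb, mock_freeze_filesystems() and the FREEZE_* values are stand-ins invented for the example, and nothing is actually flushed or frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { FREEZE_FUSE = 1, FREEZE_NORMAL = 2 };  /* stand-ins for FS_FREEZER_* */

/* Hypothetical, heavily reduced stand-in for struct super_block. */
struct mock_sb {
	bool is_fuse;
	bool has_bdev;
	bool read_only;
	bool frozen;
	int frozen_seq;   /* position in the overall freeze order, 0 = never */
};

/*
 * Walk the array from the most recently mounted entry backwards and
 * mark what the requested pass would freeze. seq is the number of
 * filesystems frozen so far; the updated count is returned.
 */
int mock_freeze_filesystems(struct mock_sb *sbs, size_t n, int which, int seq)
{
	size_t i;

	assert(sbs != NULL || n == 0);

	for (i = n; i-- > 0; ) {
		struct mock_sb *sb = &sbs[i];

		if (sb->frozen)
			continue;
		if (sb->is_fuse) {
			/* FUSE has no block device; only the FUSE pass touches it. */
			if (which & FREEZE_FUSE) {
				sb->frozen = true;
				sb->frozen_seq = ++seq;
			}
			continue;
		}
		if (!sb->has_bdev || sb->read_only || !(which & FREEZE_NORMAL))
			continue;
		sb->frozen = true;
		sb->frozen_seq = ++seq;
	}
	return seq;
}
```

Calling it once with FREEZE_FUSE and then with FREEZE_NORMAL mirrors the two calls made from freeze_processes(); read-only and bdev-less entries are left alone in both passes.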

Nigel


Re: freeze vs freezer

2008-01-03 Thread Oliver Neukum
On Thursday, 3 January 2008, Nigel Cunningham wrote:
> On top of this, I made a (too simple at the moment) freeze_filesystems
> function which iterates through &super_blocks in reverse order, freezing
> fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
> currently allow for the possibility of someone mounting (say) ext3 on
> fuse, but that would just be an extension of what's already done.

How do you deal with fuse server tasks using other fuse filesystems?
How does freeze_filesystems() look?

Regards
Oliver



Re: freeze vs freezer

2008-01-03 Thread Nigel Cunningham
Hi.

Rafael J. Wysocki wrote:
> On Wednesday, 2 of January 2008, Nigel Cunningham wrote:
>> Pavel Machek wrote:
>>>>>>>> So how do you handle threads that are blocked on I/O or a lock
>>>>>>>> during the system freeze process, then?
>>>>>>> We wait until they can continue.
>>>>>> So if I have a process blocked on an unavailable NFS mount, I can't
>>>>>> suspend?
>>>>> That's correct, you can't.
>>>>>
>>>>> [And I know what you're going to say. ;-)]
>>>> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"
>>>> instead of a zero preempt_count()?  Really what we should do is just
>>>> iterate over all of the actual physical devices and tell each one
>>>> "Block new IO requests preemptably, finish pending DMA, put the
>>>> hardware in low-power mode, and prepare for suspend/hibernate".  As
>>>> long as each driver knows how to do those simple things we can have
>>>> an entirely consistent kernel image for both suspend and for
>>>> hibernation.
>>> "each driver" means this is a lot of work. But yes, that is probably
>>> way to go, and patch would be welcome.
>> Yes, that does work. It's what I've done in my (preliminary) support for
>> fuse.
> 
> Hmm, can you please elaborate a bit?

Sorry. I wasn't very unambiguous, was I? And I'm not sure now whether
you're meaning "How does fuse support relate to freezing block devices?"
or "What's this about fuse support?". Let me therefore seek to answer
both questions:

Higher level, I know (filesystems rather than block devices), but I was
meaning the general concept of blocking new requests and completing
existing ones worked fine for the supposedly impossible fuse support.

Re fuse support, let me start by saying "I know this doesn't handle all
situations, but I think it's a good enough proof-of-concept implementation".

I added some simple hooks to the code for submitting new work to fuse
threads.

#define FUSE_MIGHT_FREEZE(superblock, desc) \
do { \
	int printed = 0; \
	while (superblock->s_frozen != SB_UNFROZEN) { \
		if (!printed) { \
			printk("%d frozen in " desc ".\n", current->pid); \
			printed = 1; \
		} \
		try_to_freeze(); \
		yield(); \
	} \
} while (0)
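
The general idea behind the hook (refuse new work while frozen, let work already in flight finish) can be shown with a single-threaded userspace model. This is a sketch under stated assumptions, not the FUSE code: struct request_gate and the gate_*() helpers are hypothetical, and where the macro above blocks the caller, the model simply counts the request as deferred.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one filesystem's request path. */
struct request_gate {
	bool frozen;
	unsigned in_flight;   /* admitted requests not yet completed */
	unsigned deferred;    /* requests that arrived while frozen */
};

/* Returns true if the request was admitted, false if it must wait. */
bool gate_submit(struct request_gate *g)
{
	if (g->frozen) {
		g->deferred++;
		return false;
	}
	g->in_flight++;
	return true;
}

/* Called when an admitted request finishes. */
void gate_complete(struct request_gate *g)
{
	assert(g->in_flight > 0);
	g->in_flight--;
}

/* Stop admitting new requests; existing ones may still complete. */
void gate_freeze(struct request_gate *g)
{
	g->frozen = true;
}

/* The filesystem is quiescent once it is frozen and fully drained. */
bool gate_is_quiescent(const struct request_gate *g)
{
	return g->frozen && g->in_flight == 0;
}

/* Reopen the gate and admit everything that was deferred. */
unsigned gate_thaw(struct request_gate *g)
{
	unsigned released = g->deferred;

	g->frozen = false;
	g->in_flight += released;
	g->deferred = 0;
	return released;
}
```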

On top of this, I made a (too simple at the moment) freeze_filesystems
function which iterates through &super_blocks in reverse order, freezing
fuse filesystems or ordinary ones. I say 'too simple' because it doesn't
currently allow for the possibility of someone mounting (say) ext3 on
fuse, but that would just be an extension of what's already done.

The end result is:

int freeze_processes(void)
{
	int error;

	printk(KERN_INFO "Stopping fuse filesystems.\n");
	freeze_filesystems(FS_FREEZER_FUSE);
	freezer_state = FREEZER_FILESYSTEMS_FROZEN;
	printk(KERN_INFO "Freezing user space processes ... ");
	error = try_to_freeze_tasks(FREEZER_USER_SPACE);
	if (error)
		goto Exit;
	printk(KERN_INFO "done.\n");

	sys_sync();
	printk(KERN_INFO "Stopping normal filesystems.\n");
	freeze_filesystems(FS_FREEZER_NORMAL);
	freezer_state = FREEZER_USERSPACE_FROZEN;
	printk(KERN_INFO "Freezing remaining freezable tasks ... ");
	error = try_to_freeze_tasks(FREEZER_KERNEL_THREADS);
	if (error)
		goto Exit;
	printk(KERN_INFO "done.");
	freezer_state = FREEZER_FULLY_ON;
 Exit:
	BUG_ON(in_atomic());
	printk("\n");
	return error;
}
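
Stripped of the printk() calls, the function is a staged sequence that stops at the first failure. The userspace sketch below shows only that control flow; the stage names, struct freeze_result, staged_freeze() and the two stub callbacks are hypothetical and stand in for try_to_freeze_tasks() and the filesystem passes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stage markers, loosely following freezer_state above. */
enum freeze_stage {
	STAGE_NONE,
	STAGE_FUSE_GATED,     /* FUSE requests are being held back */
	STAGE_FS_FROZEN,      /* userspace frozen, block filesystems frozen */
	STAGE_FULLY_FROZEN    /* kernel threads frozen as well */
};

struct freeze_result {
	enum freeze_stage reached;
	int error;            /* 0 or a negative errno value */
};

typedef int (*freeze_tasks_fn)(void);

/* Example callbacks: one that succeeds, one that reports a busy task. */
int stub_ok(void)   { return 0; }
int stub_busy(void) { return -EBUSY; }

/*
 * Run the stages in the same order as the function above and report
 * how far the sequence got before the first error.
 */
struct freeze_result staged_freeze(freeze_tasks_fn freeze_user,
				   freeze_tasks_fn freeze_kthreads)
{
	struct freeze_result r = { STAGE_NONE, 0 };

	assert(freeze_user != NULL && freeze_kthreads != NULL);

	/* Stage 1: gate FUSE first so user tasks cannot block on it. */
	r.reached = STAGE_FUSE_GATED;

	r.error = freeze_user();
	if (r.error)
		return r;

	/* Stage 2: sync and freeze block-backed filesystems (not modelled). */
	r.reached = STAGE_FS_FROZEN;

	r.error = freeze_kthreads();
	if (r.error)
		return r;

	r.reached = STAGE_FULLY_FROZEN;
	return r;
}
```

A caller that gets an error back can use the reached stage to decide how much needs to be thawed again.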

Sorry if that's more info than you wanted.

Nigel


Re: freeze vs freezer

2008-01-02 Thread Rafael J. Wysocki
On Wednesday, 2 of January 2008, Nigel Cunningham wrote:
> Hi.
> 
> Pavel Machek wrote:
> > Hi!
> > 
> >>>>>> So how do you handle threads that are blocked on I/O or a lock
> >>>>>> during the system freeze process, then?
> >>>>> We wait until they can continue.
> >>>> So if I have a process blocked on an unavailable NFS mount, I can't
> >>>> suspend?
> >>> That's correct, you can't.
> >>>
> >>> [And I know what you're going to say. ;-)]
> >> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"  
> >> instead of a zero preempt_count()?  Really what we should do is just  
> >> iterate over all of the actual physical devices and tell each one  
> >> "Block new IO requests preemptably, finish pending DMA, put the  
> >> hardware in low-power mode, and prepare for suspend/hibernate".  As  
> >> long as each driver knows how to do those simple things we can have  
> >> an entirely consistent kernel image for both suspend and for  
> >> hibernation.
> > 
> > "each driver" means this is a lot of work. But yes, that is probably
> > way to go, and patch would be welcome.
> 
> Yes, that does work. It's what I've done in my (preliminary) support for
> fuse.

Hmm, can you please elaborate a bit?

Rafael


Re: freeze vs freezer

2008-01-02 Thread Nigel Cunningham
Hi.

Pavel Machek wrote:
> Hi!
> 
>>>>>> So how do you handle threads that are blocked on I/O or a lock
>>>>>> during the system freeze process, then?
>>>>> We wait until they can continue.
>>>> So if I have a process blocked on an unavailable NFS mount, I can't
>>>> suspend?
>>> That's correct, you can't.
>>>
>>> [And I know what you're going to say. ;-)]
>> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"  
>> instead of a zero preempt_count()?  Really what we should do is just  
>> iterate over all of the actual physical devices and tell each one  
>> "Block new IO requests preemptably, finish pending DMA, put the  
>> hardware in low-power mode, and prepare for suspend/hibernate".  As  
>> long as each driver knows how to do those simple things we can have  
>> an entirely consistent kernel image for both suspend and for  
>> hibernation.
> 
> "each driver" means this is a lot of work. But yes, that is probably
> way to go, and patch would be welcome.

Yes, that does work. It's what I've done in my (preliminary) support for
fuse.

Regards,

Nigel



Re: freeze vs freezer

2008-01-02 Thread Pavel Machek
Hi!

> >>>> So how do you handle threads that are blocked on I/O or a lock
> >>>> during the system freeze process, then?
> >>>
> >>>We wait until they can continue.
> >>
> >>So if I have a process blocked on an unavailable NFS mount, I can't
> >>suspend?
> >
> >That's correct, you can't.
> >
> >[And I know what you're going to say. ;-)]
> 
> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"  
> instead of a zero preempt_count()?  Really what we should do is just  
> iterate over all of the actual physical devices and tell each one  
> "Block new IO requests preemptably, finish pending DMA, put the  
> hardware in low-power mode, and prepare for suspend/hibernate".  As  
> long as each driver knows how to do those simple things we can have  
> an entirely consistent kernel image for both suspend and for  
> hibernation.

"each driver" means this is a lot of work. But yes, that is probably
way to go, and patch would be welcome.
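
To give a feel for what "each driver" would have to provide, here is a toy userspace model of the three steps named in the quoted proposal, with rollback when one device fails. It is only a sketch: struct mock_device, device_quiesce(), device_resume() and quiesce_all() are hypothetical names, not existing driver-model callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state, reduced to what the three steps touch. */
struct mock_device {
	const char *name;
	int pending_dma;    /* outstanding DMA transfers */
	int blocked;        /* new I/O requests are being refused */
	int low_power;      /* hardware is in its low-power state */
	int fail_quiesce;   /* test hook: force this device to fail */
};

/* Block new I/O, drain DMA, then power down. Returns 0 or -1. */
int device_quiesce(struct mock_device *d)
{
	d->blocked = 1;
	if (d->fail_quiesce)
		return -1;
	d->pending_dma = 0;   /* "finish pending DMA", modelled as drained */
	d->low_power = 1;
	return 0;
}

/* Undo device_quiesce(): power up and accept I/O again. */
void device_resume(struct mock_device *d)
{
	d->low_power = 0;
	d->blocked = 0;
}

/*
 * Quiesce devices in order. If one fails, resume it and everything
 * quiesced before it, newest first, and report the failure.
 */
int quiesce_all(struct mock_device *devs, size_t n)
{
	size_t i, j;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (device_quiesce(&devs[i]) != 0) {
			for (j = i + 1; j-- > 0; )
				device_resume(&devs[j]);
			return -1;
		}
	}
	return 0;
}
```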

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: freeze vs freezer

2007-11-27 Thread Jeremy Fitzhardinge
Kyle Moffett wrote:
> On Nov 27, 2007, at 17:49:18, Jeremy Fitzhardinge wrote:
>> Rafael J. Wysocki wrote:
>>> Well, this is more-or-less how we all imagine that should be done
>>> eventually.
>>>
>>> The main problem is how to implement it without causing too much
>>> breakage.  Also, there are some dirty details that need to be taken
>>> into consideration.
>>
>> For Xen suspend/resume, I'd like to use the freezer to get all
>> threads into a known consistent state (where, specifically, they
>> don't have any outstanding pagetable updates pending).  In other
>> words, the freezer as it currently stands is what I want, modulo some
>> of these issues where it gets caught up unexpectedly.  If threads end
>> up getting frozen anywhere preempt isn't explicitly disabled, it
>> wouldn't work for me.
>
> The problem with "one freezer" is that "known consistent state" means
> something completely different to every single driver and subsystem.

Not really.  The freezer puts tasks into a particular well-understood
state: they're either in usermode, or in the kernel in the
refrigerator.  And since the places which call into the refrigerator are
explicit in the source, and not terribly numerous, it's easy to audit
exactly what the state is at each call.

> Xen wants it to mean "No pending page table updates and no more
> updates from this point forward".  A network driver wants it to mean
> "All pending network packets DMAed out or in and the device shut down
> with all remaining packets queued".  A SATA controller wants it to mean
> "All DMA quiesced and no more commands", etc.

Well, those are somewhat different.  The existing suspend/resume driver
callbacks are sufficient for a device to be in that state.  What I want
for Xen is more global: I just want to make sure tasks are not preempted
in the middle of a state which can't be suspended.  The specific details
of the state I want are moderately complex, but short lived.  The
problem with other mechanisms - like stop_machine - is that they can
leave threads preempted in one of the states I can't handle, whereas the
freezer is more deterministic.

> The only way to have that work is to put minimal definitions of what
> state you care about in the drivers themselves.  For Xen this means
> that you need to have an appropriately-timed suspend handler which
> hooks into Xen code very precisely to create and preserve the "No
> pending page table updates" state that you care about.  It will be
> more work in the short term but it's the only maintainable solution in
> the long term IMO.

No, that doesn't really work.  Aside from scattering hooks everywhere
there are pagetable updates, there's no real existing place to hook into.
While I could put those hooks in, they would amount to changing the
kernel-internal pagetable update interface for everyone to deal with a
corner case of a fairly obscure user - I don't think it's a good tradeoff.

The freezer is nice because the state it puts each task into is
well-defined, and is well-suited for Xen's use.  In fact, I would agree
with you that the use I want to put the freezer to better suits it than
its current use in suspend/resume.

J


Re: freeze vs freezer

2007-11-27 Thread Kyle Moffett

On Nov 27, 2007, at 17:49:18, Jeremy Fitzhardinge wrote:

> Rafael J. Wysocki wrote:
>> Well, this is more-or-less how we all imagine that should be done
>> eventually.
>>
>> The main problem is how to implement it without causing too much
>> breakage.  Also, there are some dirty details that need to be
>> taken into consideration.
>
> For Xen suspend/resume, I'd like to use the freezer to get all
> threads into a known consistent state (where, specifically, they
> don't have any outstanding pagetable updates pending).  In other
> words, the freezer as it currently stands is what I want, modulo
> some of these issues where it gets caught up unexpectedly.  If
> threads end up getting frozen anywhere preempt isn't explicitly
> disabled, it wouldn't work for me.


The problem with "one freezer" is that "known consistent state" means  
something completely different to every single driver and subsystem.   
Xen wants it to mean "No pending page table updates and no more  
updates from this point forward".  A network driver wants it to mean  
"All pending network packets DMAed out or in and the device shut down  
with all remaining packets queued".  A SATA controller wants it to  
mean "All DMA quiesced and no more commands", etc.


The only way to have that work is to put minimal definitions of what  
state you care about in the drivers themselves.  For Xen this means  
that you need to have an appropriately-timed suspend handler which  
hooks into Xen code very precisely to create and preserve the "No  
pending page table updates" state that you care about.  It will be  
more work in the short term but it's the only maintainable solution  
in the long term IMO.


Cheers,
Kyle Moffett



Re: freeze vs freezer

2007-11-27 Thread Jeremy Fitzhardinge
Rafael J. Wysocki wrote:
> Well, this is more-or-less how we all imagine that should be done eventually.
>
> The main problem is how to implement it without causing too much breakage.
> Also, there are some dirty details that need to be taken into consideration.
>   

For Xen suspend/resume, I'd like to use the freezer to get all threads
into a known consistent state (where, specifically, they don't have any
outstanding pagetable updates pending).  In other words, the freezer as
it currently stands is what I want, modulo some of these issues where it
gets caught up unexpectedly.  If threads end up getting frozen anywhere
preempt isn't explicitly disabled, it wouldn't work for me.

J


Re: freeze vs freezer

2007-11-27 Thread Rafael J. Wysocki
On Tuesday, 27 of November 2007, Kyle Moffett wrote:
> On Nov 27, 2007, at 12:40:24, Rafael J. Wysocki wrote:
> > On Tuesday, 27 of November 2007, Matthew Garrett wrote:
> >> On Mon, Nov 26, 2007 at 10:53:34PM +0100, Rafael J. Wysocki wrote:
> >>> On Monday, 26 of November 2007, David Chinner wrote:
> >>>> So how do you handle threads that are blocked on I/O or a lock
> >>>> during the system freeze process, then?
> >>>
> >>> We wait until they can continue.
> >>
> >> So if I have a process blocked on an unavailable NFS mount, I can't
> >> suspend?
> >
> > That's correct, you can't.
> >
> > [And I know what you're going to say. ;-)]
> 
> Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"  
> instead of a zero preempt_count()?  Really what we should do is just  
> iterate over all of the actual physical devices and tell each one  
> "Block new IO requests preemptably, finish pending DMA, put the  
> hardware in low-power mode, and prepare for suspend/hibernate".  As  
> long as each driver knows how to do those simple things we can have  
> an entirely consistent kernel image for both suspend and for  
> hibernation.

Well, this is more-or-less how we all imagine that should be done eventually.

The main problem is how to implement it without causing too much breakage.
Also, there are some dirty details that need to be taken into consideration.

> When all tasks are preemptable we can very trivially rely on the  
> drivers to enforce the "Stop new IO submission" with a dirt-simple  
> semaphore or waitqueue.  The sleep itself will be  
> TASK_UNINTERRUPTIBLE, but it will be done from a preemptible context.

If there are any drivers that make their devices available via mmap(), that
won't be sufficient.

Probably, we'll need two iterations over devices to handle all corner cases.

Moreover, for hibernation we need to resume at least some devices in order
to save the image, which shouldn't result in unblocking the waiting tasks.

> That way the system suspend time is the sum of the suspend times of  
> the devices on the system, and the suspend time of any given device  
> is the sum of its maximum non-preemptible critical section and the  
> time to flush all of its remaining pending DMA/etc.  This is almost  
> completely independent of the load-level of the machine, and it does  
> not depend on things like NFS filesystems.  The one gotcha is that it  
> does not flush dirty filesystem pages to disk first, although that  
> could be fixed with a few VFS and blockdev hooks which hierarchically  
> flush and "freeze" block devices and filesystems before actually  
> disabling devices much the way that device-mapper can pause a device  
> to take a snapshot and end up with a clean journal on the filesystem  
> afterwards.

Yes, I generally agree.

Greetings,
Rafael


Re: freeze vs freezer

2007-11-27 Thread Kyle Moffett

On Nov 27, 2007, at 12:40:24, Rafael J. Wysocki wrote:
> On Tuesday, 27 of November 2007, Matthew Garrett wrote:
>> On Mon, Nov 26, 2007 at 10:53:34PM +0100, Rafael J. Wysocki wrote:
>>> On Monday, 26 of November 2007, David Chinner wrote:
>>>> So how do you handle threads that are blocked on I/O or a lock
>>>> during the system freeze process, then?
>>>
>>> We wait until they can continue.
>>
>> So if I have a process blocked on an unavailable NFS mount, I can't
>> suspend?
>
> That's correct, you can't.
>
> [And I know what you're going to say. ;-)]


Why exactly does suspend/hibernation depend on "TASK_INTERRUPTIBLE"  
instead of a zero preempt_count()?  Really what we should do is just  
iterate over all of the actual physical devices and tell each one  
"Block new IO requests preemptably, finish pending DMA, put the  
hardware in low-power mode, and prepare for suspend/hibernate".  As  
long as each driver knows how to do those simple things we can have  
an entirely consistent kernel image for both suspend and for  
hibernation.


When all tasks are preemptable we can very trivially rely on the  
drivers to enforce the "Stop new IO submission" with a dirt-simple  
semaphore or waitqueue.  The sleep itself will be  
TASK_UNINTERRUPTIBLE, but it will be done from a preemptible context.
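
A minimal userspace sketch of such a gate, in Python rather than kernel C. `threading.Condition` stands in for a kernel waitqueue, and the `IoGate` class with its method names is an assumption made up for this example, not an existing driver interface. Submitters sleep while the gate is closed, and `suspend()` closes the gate and then waits for in-flight requests to drain.

```python
import threading


class IoGate:
    """Toy "stop new I/O submission" gate for one device."""

    def __init__(self):
        self._cond = threading.Condition()
        self._open = True
        self._in_flight = 0

    def submit_begin(self):
        with self._cond:
            # Sleep until the device accepts I/O again. In the kernel this
            # would be the uninterruptible sleep from a preemptible context.
            while not self._open:
                self._cond.wait()
            self._in_flight += 1

    def submit_end(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()  # may wake a suspend() waiting to drain

    def suspend(self):
        with self._cond:
            self._open = False       # block new submissions first
            while self._in_flight:   # then wait for pending requests to finish
                self._cond.wait()

    def resume(self):
        with self._cond:
            self._open = True
            self._cond.notify_all()  # release every blocked submitter
```

A thread that calls `submit_begin()` after `suspend()` simply stays asleep until `resume()`, so nothing outside the driver has to know about it.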


That way the system suspend time is the sum of the suspend times of  
the devices on the system, and the suspend time of any given device  
is the sum of its maximum non-preemptible critical section and the  
time to flush all of its remaining pending DMA/etc.  This is almost  
completely independent of the load-level of the machine, and it does  
not depend on things like NFS filesystems.  The one gotcha is that it  
does not flush dirty filesystem pages to disk first, although that  
could be fixed with a few VFS and blockdev hooks which hierarchically  
flush and "freeze" block devices and filesystems before actually  
disabling devices much the way that device-mapper can pause a device  
to take a snapshot and end up with a clean journal on the filesystem  
afterwards.
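
The timing claim above can be written out as plain arithmetic. The Python sketch below is only that; the device names and microsecond figures are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    max_critical_section_us: int  # longest non-preemptible section in the driver
    dma_flush_us: int             # time to drain the remaining pending DMA

    def suspend_us(self):
        # Per-device bound: critical section plus flush time.
        return self.max_critical_section_us + self.dma_flush_us


def system_suspend_us(devices):
    # System bound: sum over devices, independent of load or NFS state.
    return sum(d.suspend_us() for d in devices)


# Invented example figures.
devices = [Device("nic", 500, 4000), Device("sata", 1000, 20000)]
total = system_suspend_us(devices)  # 4500 + 21000 = 25500
```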


Cheers,
Kyle Moffett



Re: freeze vs freezer

2007-11-27 Thread Rafael J. Wysocki
On Tuesday, 27 of November 2007, Matthew Garrett wrote:
> On Mon, Nov 26, 2007 at 10:53:34PM +0100, Rafael J. Wysocki wrote:
> > On Monday, 26 of November 2007, David Chinner wrote:
> > > So how do you handle threads that are blocked on I/O or a lock during
> > > the system freeze process, then?
> > 
> > We wait until they can continue.
> 
> So if I have a process blocked on an unavailable NFS mount, I can't
> suspend?

That's correct, you can't.

[And I know what you're going to say. ;-)]

Greetings,
Rafael


Re: freeze vs freezer

2007-11-26 Thread Matthew Garrett
On Mon, Nov 26, 2007 at 10:53:34PM +0100, Rafael J. Wysocki wrote:
> On Monday, 26 of November 2007, David Chinner wrote:
> > So how do you handle threads that are blocked on I/O or a lock during
> > the system freeze process, then?
> 
> We wait until they can continue.

So if I have a process blocked on an unavailable NFS mount, I can't
suspend?

-- 
Matthew Garrett | [EMAIL PROTECTED]


Re: freeze vs freezer

2007-11-26 Thread Rafael J. Wysocki
On Monday, 26 of November 2007, David Chinner wrote:
> On Sat, Nov 24, 2007 at 12:47:21AM +0100, Rafael J. Wysocki wrote:
> > On Thursday, 22 of November 2007, Jeremy Fitzhardinge wrote:
> > > It seems that a process blocked in a write to an xfs filesystem due to
> > > xfs_freeze cannot be frozen by the freezer.
> > 
> > The freezer doesn't handle tasks in TASK_UNINTERRUPTIBLE and I don't know how
> > to make it handle them without at least partially defeating its purpose.
> 
> So how do you handle threads that are blocked on I/O or a lock during
> the system freeze process, then?

We wait until they can continue.

> > > I see this if I suspend my laptop while doing something xfs-filesystem
> > > intensive, like a kernel build.  My suspend scripts freeze the XFS
> > > filesystem (as Dave said I should), which presumably blocks some writer,
> > > and then the freezer times out and fails to complete.
> > > 
> > > Here's part of the process dump the freezer does when it times out:
> > > 
> > > cc1   D  0 18138  18137
> > >dd5f1e24 00200082 0002  ecdeeb00 ecdeec64 c200f280 0001
> > >009c09a0 dd5f1e0c dd5f1e0c 000f    dd5f1e74
> > >c7beb480 dd5f1e88 dd5f1ea8 c0228d97 e8889540 dd5f1e38 c015b75d dd5f1e44
> > > Call Trace:
> > >  [] xfs_write+0xf4/0x6d9
> > >  [] xfs_file_aio_write+0x53/0x5b
> > >  [] do_sync_write+0xae/0xec
> > >  [] vfs_write+0xa4/0x120
> > >  [] sys_write+0x3b/0x60
> > >  [] sysenter_past_esp+0x6b/0xa1
> > >  ===
> > > 
> > > 
> > > I haven't looked at how to fix this yet.  I only just worked out why I
> > > was getting suspend failures.
> > 
> > Well, you can add freezer_do_not_count()/freezer_count() annotations to
> > xfs_write() (and whatever else is blocked as a result of the XFS being frozen).
> 
> May as well annotate the whole VFS, then, because once the transaction
> subsystem is frozen any operation that modifies the filesystem will get
> blocked like this.

Well, I don't know how this mechanism actually works, so I can't comment.

Is there a mutex on which tasks block if the filesystem is frozen?

Greetings,
Rafael


Re: freeze vs freezer

2007-11-26 Thread David Chinner
On Sat, Nov 24, 2007 at 12:47:21AM +0100, Rafael J. Wysocki wrote:
> On Thursday, 22 of November 2007, Jeremy Fitzhardinge wrote:
> > It seems that a process blocked in a write to an xfs filesystem due to
> > xfs_freeze cannot be frozen by the freezer.
> 
> The freezer doesn't handle tasks in TASK_UNINTERRUPTIBLE and I don't know how
> to make it handle them without at least partially defeating its purpose.

So how do you handle threads that are blocked on I/O or a lock during
the system freeze process, then?

> > I see this if I suspend my laptop while doing something xfs-filesystem
> > intensive, like a kernel build.  My suspend scripts freeze the XFS
> > filesystem (as Dave said I should), which presumably blocks some writer,
> > and then the freezer times out and fails to complete.
> > 
> > Here's part of the process dump the freezer does when it times out:
> > 
> > cc1   D  0 18138  18137
> >dd5f1e24 00200082 0002  ecdeeb00 ecdeec64 c200f280 0001
> >009c09a0 dd5f1e0c dd5f1e0c 000f    dd5f1e74
> >c7beb480 dd5f1e88 dd5f1ea8 c0228d97 e8889540 dd5f1e38 c015b75d dd5f1e44
> > Call Trace:
> >  [] xfs_write+0xf4/0x6d9
> >  [] xfs_file_aio_write+0x53/0x5b
> >  [] do_sync_write+0xae/0xec
> >  [] vfs_write+0xa4/0x120
> >  [] sys_write+0x3b/0x60
> >  [] sysenter_past_esp+0x6b/0xa1
> >  ===
> > 
> > 
> > I haven't looked at how to fix this yet.  I only just worked out why I
> > was getting suspend failures.
> 
> Well, you can add freezer_do_not_count()/freezer_count() annotations to
> xfs_write() (and whatever else is blocked as a result of the XFS being frozen).

May as well annotate the whole VFS, then, because once the transaction
subsystem is frozen any operation that modifies the filesystem will get
blocked like this.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group


Re: freeze vs freezer

2007-11-26 Thread Rafael J. Wysocki
On Monday, 26 of November 2007, Jeremy Fitzhardinge wrote:
> Rafael J. Wysocki wrote:
> > On Thursday, 22 of November 2007, Jeremy Fitzhardinge wrote:
> >   
> >> It seems that a process blocked in a write to an xfs filesystem due to
> >> xfs_freeze cannot be frozen by the freezer.
> >> 
> >
> > The freezer doesn't handle tasks in TASK_UNINTERRUPTIBLE and I don't know how
> > to make it handle them without at least partially defeating its purpose.
> >   
> 
> Well, I guess the question is whether an xfs-frozen writer really needs
> to be UNINTERRUPTIBLE from the freezer's perspective (clearly it does
> from usermode's perspective - filesystem writes just don't return EINTR).
> 
> From a quick poke around, it looks to me like freezing is actually
> implemented in the VFS layer rather than in XFS itself: is that right? 

I don't know the details.

> Could vfs_check_frozen() be changed to something that is freezer-compatible?

That seems doable in principle.  I'll have a closer look at it.

> >> I see this if I suspend my laptop while doing something xfs-filesystem
> >> intensive, like a kernel build.  My suspend scripts freeze the XFS
> >> filesystem (as Dave said I should), which presumably blocks some writer,
> >> and then the freezer times out and fails to complete.
> >>
> >> Here's part of the process dump the freezer does when it times out:
> >>
> >> cc1   D  0 18138  18137
> >>dd5f1e24 00200082 0002  ecdeeb00 ecdeec64 c200f280 0001
> >>009c09a0 dd5f1e0c dd5f1e0c 000f    dd5f1e74
> >>c7beb480 dd5f1e88 dd5f1ea8 c0228d97 e8889540 dd5f1e38 c015b75d dd5f1e44
> >> Call Trace:
> >>  [] xfs_write+0xf4/0x6d9
> >>  [] xfs_file_aio_write+0x53/0x5b
> >>  [] do_sync_write+0xae/0xec
> >>  [] vfs_write+0xa4/0x120
> >>  [] sys_write+0x3b/0x60
> >>  [] sysenter_past_esp+0x6b/0xa1
> >>  ===
> >>
> >>
> >> I haven't looked at how to fix this yet.  I only just worked out why I
> >> was getting suspend failures.
> >> 
> >
> > Well, you can add freezer_do_not_count()/freezer_count() annotations to
> > xfs_write() (and whatever else is blocked as a result of the XFS being frozen).
> >   
> 
> What would be the implications of that?  Would that just prevent
> freezing while there's something blocked there?

The freezer will not wait for this particular task.  Still, the task will have
TIF_FREEZE set, so it will freeze as soon as freezer_count() is reached,
unless the thawing of tasks is carried out first.

If used in the right place, it's reasonably safe, but we need to know what
the right place is.  [That's how we handle vfork(), BTW.]
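
To make the semantics described above concrete, here is a small userspace Python model (not the kernel implementation). The `Task` class, its `skip` and `freeze_requested` fields, and `freeze_pass()` are assumptions made up for this sketch; `freeze_requested` stands in for TIF_FREEZE, and only the two annotation names come from the discussion.

```python
class Task:
    def __init__(self, name, uninterruptible=False):
        self.name = name
        self.uninterruptible = uninterruptible  # blocked in "D" state
        self.skip = False              # set between the two annotations
        self.freeze_requested = False  # stands in for TIF_FREEZE
        self.frozen = False

    def freezer_do_not_count(self):
        # Tell the freezer not to wait for this task while it is blocked.
        self.skip = True

    def freezer_count(self):
        # Reached once the blocking section is over.
        self.skip = False
        self.uninterruptible = False
        if self.freeze_requested:
            self.frozen = True  # freeze now, unless tasks were already thawed


def freeze_pass(tasks):
    """Request freezing; return names of tasks the freezer must still wait for."""
    waiting = []
    for t in tasks:
        t.freeze_requested = True
        if t.uninterruptible:
            if not t.skip:
                waiting.append(t.name)
        else:
            t.frozen = True  # a runnable task reaches its freeze point by itself
    return waiting


def thaw(tasks):
    for t in tasks:
        t.freeze_requested = False
        t.frozen = False
```

In this model an unannotated blocked task shows up in the list returned by `freeze_pass()`, while an annotated one does not and freezes later in `freezer_count()`, provided `thaw()` has not run in between.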
 
> > Generally, that would be risky without the freezing of XFS, however,
> > because it might leak filesystem data to a storage device after the
> > hibernation image has been created, which would result in filesystem
> > corruption after the resume.
> >
> > Still, if you only suspend to RAM, that should be safe.
> >   
> 
> I specifically added it because I was getting data loss due to crashes
> during suspend/resume problems.  It's been pretty stable lately, but I
> may as well remove the xfs_freeze from my suspend scripts if this is the
> solution.

Not exactly. :-)

> I think the broader issue is that there's no reason in principle why
> something blocked due to xfs-freezing (or vfs freezing) should prevent
> the freezer from completing.

Agreed, but the only way to tell the freezer "don't wait for this task", if the
task in question is in TASK_UNINTERRUPTIBLE, is to annotate it.

Greetings,
Rafael


Re: freeze vs freezer

2007-11-26 Thread Jeremy Fitzhardinge
Rafael J. Wysocki wrote:
> On Thursday, 22 of November 2007, Jeremy Fitzhardinge wrote:
>   
>> It seems that a process blocked in a write to an xfs filesystem due to
>> xfs_freeze cannot be frozen by the freezer.
>> 
>
> The freezer doesn't handle tasks in TASK_UNINTERRUPTIBLE and I don't know how
> to make it handle them without at least partially defeating its purpose.
>   

Well, I guess the question is whether an xfs-frozen writer really needs
to be UNINTERRUPTIBLE from the freezer's perspective (clearly it does
from usermode's perspective - filesystem writes just don't return EINTR).

From a quick poke around, it looks to me like freezing is actually
implemented in the VFS layer rather than in XFS itself: is that right? 
Could vfs_check_frozen() be changed to something that is freezer-compatible?

>> I see this if I suspend my laptop while doing something xfs-filesystem
>> intensive, like a kernel build.  My suspend scripts freeze the XFS
>> filesystem (as Dave said I should), which presumably blocks some writer,
>> and then the freezer times out and fails to complete.
>>
>> Here's part of the process dump the freezer does when it times out:
>>
>> cc1   D  0 18138  18137
>>dd5f1e24 00200082 0002  ecdeeb00 ecdeec64 c200f280 0001
>>009c09a0 dd5f1e0c dd5f1e0c 000f    dd5f1e74
>>c7beb480 dd5f1e88 dd5f1ea8 c0228d97 e8889540 dd5f1e38 c015b75d dd5f1e44
>> Call Trace:
>>  [] xfs_write+0xf4/0x6d9
>>  [] xfs_file_aio_write+0x53/0x5b
>>  [] do_sync_write+0xae/0xec
>>  [] vfs_write+0xa4/0x120
>>  [] sys_write+0x3b/0x60
>>  [] sysenter_past_esp+0x6b/0xa1
>>  ===
>>
>>
>> I haven't looked at how to fix this yet.  I only just worked out why I
>> was getting suspend failures.
>> 
>
> Well, you can add freezer_do_not_count()/freezer_count() annotations to
> xfs_write() (and whatever else is blocked as a result of the XFS being frozen).
>   

What would be the implications of that?  Would that just prevent
freezing while there's something blocked there?

> Generally, that would be risky without the freezing of XFS, however, because
> it might leak filesystem data to a storage device after the hibernation image
> has been created, which would result in filesystem corruption after the
> resume.
>
> Still, if you only suspend to RAM, that should be safe.
>   

I specifically added it because I was getting data loss due to crashes
during suspend/resume problems.  It's been pretty stable lately, but I
may as well remove the xfs_freeze from my suspend scripts if this is the
solution.

I think the broader issue is that there's no reason in principle why
something blocked due to xfs-freezing (or vfs freezing) should prevent
the freezer from completing.

J


Re: freeze vs freezer

2007-11-23 Thread Rafael J. Wysocki
On Thursday, 22 of November 2007, Jeremy Fitzhardinge wrote:
> It seems that a process blocked in a write to an xfs filesystem due to
> xfs_freeze cannot be frozen by the freezer.

The freezer doesn't handle tasks in TASK_UNINTERRUPTIBLE and I don't know how
to make it handle them without at least partially defeating its purpose.
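
The failure mode can be sketched with a simulated clock. This is a userspace Python toy, not the kernel's freezer loop; the function name, the state strings, and the tick counts are assumptions made up for illustration. The freezer keeps retrying until every task is frozen or the timeout expires, and a task that stays in uninterruptible sleep for the whole window makes the attempt fail.

```python
def freeze_with_timeout(states, wake_at, timeout_ticks):
    """states: name -> "runnable" or "uninterruptible".
    wake_at: name -> tick at which an uninterruptible task becomes runnable.
    Returns (success, sorted names of tasks that never froze)."""
    frozen = set()
    for tick in range(timeout_ticks):
        for name, state in states.items():
            # A blocked task only becomes freezable once its wait finishes.
            if state == "uninterruptible" and wake_at.get(name, timeout_ticks) <= tick:
                state = states[name] = "runnable"
            if state == "runnable":
                frozen.add(name)
        if len(frozen) == len(states):
            return True, []
    return False, sorted(set(states) - frozen)
```

With a writer that never wakes up, as in the xfs_freeze case reported here, the function returns failure and names the stuck task; with a writer that wakes before the timeout, it succeeds.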

> I see this if I suspend my laptop while doing something xfs-filesystem
> intensive, like a kernel build.  My suspend scripts freeze the XFS
> filesystem (as Dave said I should), which presumably blocks some writer,
> and then the freezer times out and fails to complete.
> 
> Here's part of the process dump the freezer does when it times out:
> 
> cc1   D  0 18138  18137
>dd5f1e24 00200082 0002  ecdeeb00 ecdeec64 c200f280 0001
>009c09a0 dd5f1e0c dd5f1e0c 000f    dd5f1e74
>c7beb480 dd5f1e88 dd5f1ea8 c0228d97 e8889540 dd5f1e38 c015b75d dd5f1e44
> Call Trace:
>  [] xfs_write+0xf4/0x6d9
>  [] xfs_file_aio_write+0x53/0x5b
>  [] do_sync_write+0xae/0xec
>  [] vfs_write+0xa4/0x120
>  [] sys_write+0x3b/0x60
>  [] sysenter_past_esp+0x6b/0xa1
>  ===
> 
> 
> I haven't looked at how to fix this yet.  I only just worked out why I
> was getting suspend failures.

Well, you can add freezer_do_not_count()/freezer_count() annotations to
xfs_write() (and whatever else is blocked as a result of the XFS being frozen).

Generally, that would be risky without the freezing of XFS, however, because it
might leak filesystem data to a storage device after the hibernation image has
been created, which would result in filesystem corruption after the resume.

Still, if you only suspend to RAM, that should be safe.

Greetings,
Rafael