Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-17 Thread Mark Syms

On Mon, Dec 17, 2018 at 09:58:47AM -0500, Bob Peterson wrote:
> Dave Teigland recommended. Unless I'm mistaken, Dave has said that 
> GFS2 should never withdraw; it should always just kernel panic (Dave, 
> correct me if I'm wrong). At least this patch confines that behavior 
> to a small subset of withdraws.

The basic idea is that you want to get a malfunctioning node out of the way as 
quickly as possible so others can recover and carry on.  Escalating a partial 
failure into a total node failure is the best way to do that in this case.  
Specialized recovery paths run from a partially failed node won't be as 
reliable, and are prone to blocking all the nodes.

I think a reasonable alternative to this is to just sit in an infinite retry 
loop until the i/o succeeds.

Dave
[Mark Syms] I would hope that this code would only trigger after some effort 
has been put into retrying, as panicking the host on the first I/O failure seems 
like a surefire way to get unhappy users (and in our case paying customers). 
As Edwin points out, there may be other filesystems that may be able to cleanly 
unmount and thus avoid having to check everything on restart.

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-17 Thread Bob Peterson

- Original Message -
> I think a reasonable alternative to this is to just sit in an infinite retry
> loop until the i/o succeeds.
> 
> Dave
> [Mark Syms] I would hope that this code would only trigger after some effort
> has been put into retrying, as panicking the host on the first I/O failure
> seems like a surefire way to get unhappy users (and in our case paying
> customers). As Edwin points out, there may be other filesystems that may be
> able to cleanly unmount and thus avoid having to check everything on
> restart.

Hi Mark,

Perhaps. I'm not a block layer or iSCSI expert, but afaik, it's not the file
system's job to retry IO, and never has been, right?

There are already iscsi tuning parameters, vfs tuning parameters, etc. So
if an IO error is sent to GFS2 for a write operation, it means the retry
algorithms and operation timeout algorithms built into the layers below us
(the iscsi layer, scsi layer, block layer, tcp/ip layer etc.) have all failed
and given up on the IO operation. We can't really justify adding yet another
layer of retries on top of all that, can we?

I see your point, and perhaps the system should stay up to continue other
mission-critical operations that may not require the faulted hardware.
But what's a viable alternative? As Dave T. suggested, we can keep resubmitting
the IO until it completes, but then the journal never gets replayed and nobody
can have those locks ever again, and that would cause a potential hang of the
entire cluster, especially in cases where there's only one device that's failed
and the whole cluster is using it.

In GFS2, we've got a concept of marking a resource group "in error" so the
other nodes won't try to use it, but the same corruption that affects resource
groups could be extrapolated to "hot" dinodes as well. For example, suppose the
root (mount-point) dinode was in the journal. Now the whole cluster is hung
rather than just the one node.

Regards,

Bob Peterson
Red Hat File Systems

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-18 Thread Mark Syms

Hi Bob,

I agree, it's a hard problem. I'm just trying to understand that we've done the 
absolute best we can and that if this condition is hit then the best solution 
really is to just kill the node. I guess it's also a question of how common 
this actually ends up being. We have now got customers starting to use GFS2 for 
VM storage on XenServer so I guess we'll just have to see how many support 
calls we get in on it.

Thanks,

Mark.


Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-18 Thread Bob Peterson

- Original Message -
> Hi Bob,
> 
> I agree, it's a hard problem. I'm just trying to understand that we've done
> the absolute best we can and that if this condition is hit then the best
> solution really is to just kill the node. I guess it's also a question of
> how common this actually ends up being. We have now got customers starting
> to use GFS2 for VM storage on XenServer so I guess we'll just have to see
> how many support calls we get in on it.
> 
> Thanks,
> 
> Mark.

Hi Mark,

I don't expect the problem to be very common in the real world. 
The user has to get IO errors while writing to the GFS2 journal, which is
not very common. The patch is basically reacting to a phenomenon we
recently started noticing in which the HBA (qla2xxx) driver shuts down
and stops accepting requests when you do abnormal reboots (which we sometimes
do to test node recovery). In these cases, the node doesn't go down right away.
It stays up just long enough to cause IO errors with subsequent withdraws,
which, we discovered, results in file system corruption.
Normal reboots, "/sbin/reboot -fin", and "echo b > /proc/sysrq-trigger" should
not have this problem, nor should node fencing, etc.

And like I said, I'm open to suggestions on how to fix it. I wish there was a
better solution.

As it is, I'd kind of like to get something into this merge window for the
upstream kernel, but I'll need to submit the pull request for that probably
tomorrow or Thursday. If we find a better solution, we can always revert these
changes and implement a new one.

Regards,

Bob Peterson

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-18 Thread Mark Syms

Thanks Bob,

We believe we have seen these issues from time to time in our automated testing 
but I suspect that they're indicating a configuration problem with the backing 
storage. For flexibility a proportion of our purely functional testing will use 
storage provided by a VM running a software iSCSI target and these tests seem 
to be somewhat susceptible to getting I/O errors, some of which will inevitably 
end up being in the journal. If we start to see a lot we'll need to look at the 
config of the VMs first I think.

Mark.


Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing

2018-12-19 Thread Steven Whitehouse


Hi,

On 18/12/2018 16:09, Mark Syms wrote:

> Thanks Bob,
>
> We believe we have seen these issues from time to time in our automated testing
> but I suspect that they're indicating a configuration problem with the backing
> storage. For flexibility a proportion of our purely functional testing will use
> storage provided by a VM running a software iSCSI target and these tests seem
> to be somewhat susceptible to getting I/O errors, some of which will inevitably
> end up being in the journal. If we start to see a lot we'll need to look at the
> config of the VMs first I think.
>
> Mark.


I think there are a few things here... firstly Bob is right that in 
general if we are going to retry I/O, then this would be done at the 
block layer, by multipath for example. However, having a way to 
gracefully deal with failure aside from fencing/rebooting a node is useful.


One issue with that is tracking outstanding I/O. For the journal we do 
that anyway, since we count the number of in flight I/Os. In other cases 
this is more difficult, for example where we use the VFS library 
functions for readpages/writepages. If we were able to track all the I/O 
that GFS2 produces and be certain to be able to turn off future I/O (or 
writes at least) internally then we could avoid using the dm based 
solution for withdraw that we currently have. That would be an 
improvement in terms of reliability.
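
The idea of counting in-flight I/O and being able to turn off future writes can be sketched in user-space C as below. This is only an illustration under stated assumptions, not GFS2 code: `struct io_gate` and the `io_gate_*()` functions are hypothetical names, and C11 atomics stand in for whatever primitives the kernel code would actually use. The writer increments the counter before checking the flag, so once the flag is set and the counter reads zero, no further write can slip through.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate: counts writes in flight and can refuse new ones. */
struct io_gate {
    atomic_int  in_flight;   /* writes submitted but not yet completed */
    atomic_bool writes_off;  /* set once, when writes must stop */
};

void io_gate_init(struct io_gate *g)
{
    atomic_init(&g->in_flight, 0);
    atomic_init(&g->writes_off, false);
}

/* Returns true if the caller may submit the write. */
bool io_gate_start_write(struct io_gate *g)
{
    /* Count first, then check the flag, so a concurrent shut-off
     * either sees our count or we see its flag. */
    atomic_fetch_add(&g->in_flight, 1);
    if (atomic_load(&g->writes_off)) {
        atomic_fetch_sub(&g->in_flight, 1);
        return false;
    }
    return true;
}

/* Called from the completion path of every write that was started. */
void io_gate_end_write(struct io_gate *g)
{
    int prev = atomic_fetch_sub(&g->in_flight, 1);
    assert(prev > 0);
    (void)prev;
}

/* Refuse all writes from now on. */
void io_gate_shut_off(struct io_gate *g)
{
    atomic_store(&g->writes_off, true);
}

/* True once writes are off and every started write has completed. */
bool io_gate_quiesced(struct io_gate *g)
{
    return atomic_load(&g->writes_off) && atomic_load(&g->in_flight) == 0;
}
```

A caller would wrap each submission in `io_gate_start_write()` / `io_gate_end_write()` and poll or wait for `io_gate_quiesced()` after `io_gate_shut_off()`. The hard part described above, routing the VFS readpages/writepages paths through such a gate, is not addressed by this sketch.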


The other issue is the one that Bob has been looking at, namely a way to 
signal that recovery is due, but without requiring fencing. If we can 
solve both of those issues, then that would certainly go a long way 
towards improving this,


Steve.





[Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing to the journal

2018-12-17 Thread Bob Peterson

Hi,

Before this patch, gfs2 would try to withdraw when it encountered
io errors writing to its journal. That's incorrect behavior
because if it can't write to the journal, it cannot write revokes
for the metadata it sends down. A withdraw will cause gfs2 to
unmount the file system from dlm, which is a controlled shutdown,
but the io error means it cannot write the UNMOUNT log header
to the journal. The controlled shutdown will cause dlm to release
all its locks, allowing other nodes to update the metadata.
When the node rejoins the cluster and sees no UNMOUNT log header
it will see the journal is dirty and replay it, but after the
other nodes may have changed the metadata, thus corrupting the
file system.

If we get an io error writing to the journal, the only correct
thing to do is to kernel panic. That will force dlm to go through
its full recovery process on the other cluster nodes, freeze all
locks, and make sure the journal is replayed by a node in the
cluster before any other nodes get the affected locks and try to
modify the metadata in the unfinished portion of the journal.

This patch changes the behavior so that io errors encountered
in the journals cause an immediate kernel panic with a message.
However, quota update errors are still allowed to withdraw as
before.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/lops.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index 94dcab655bc0..44b85f7675d4 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -209,11 +209,9 @@ static void gfs2_end_log_write(struct bio *bio)
 	struct page *page;
 	int i;
 
-	if (bio->bi_status) {
-		fs_err(sdp, "Error %d writing to journal, jid=%u\n",
-		       bio->bi_status, sdp->sd_jdesc->jd_jid);
-		wake_up(&sdp->sd_logd_waitq);
-	}
+	if (bio->bi_status)
+		panic("Error %d writing to journal, jid=%u\n", bio->bi_status,
+		      sdp->sd_jdesc->jd_jid);
 
 	bio_for_each_segment_all(bvec, bio, i) {
 		page = bvec->bv_page;
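
To make the ordering problem in the patch description concrete, here is a toy user-space model in C. It is only a sketch and not GFS2 code: `struct toy_fs`, `replay_journal()`, `other_node_write()` and the two `final_value_after_*()` helpers are hypothetical names, and the whole file system is reduced to one metadata block plus one journal entry belonging to the node that hit the io error.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (assumption): one metadata block and one journal entry. */
struct toy_fs {
    int  disk_value;     /* in-place contents of the metadata block */
    int  journal_value;  /* value recorded in the failed node's journal */
    bool journal_dirty;  /* true until the journal has been replayed */
};

/* Replay the journal: copy the journaled value over the in-place block. */
static void replay_journal(struct toy_fs *fs)
{
    assert(fs);
    if (fs->journal_dirty) {
        fs->disk_value = fs->journal_value;
        fs->journal_dirty = false;
    }
}

/* Another node acquires the lock and updates the block in place. */
static void other_node_write(struct toy_fs *fs, int value)
{
    assert(fs);
    fs->disk_value = value;
}

/* Withdraw ordering: locks are released first, replay happens on rejoin. */
int final_value_after_withdraw(void)
{
    struct toy_fs fs = { .disk_value = 0, .journal_value = 1,
                         .journal_dirty = true };

    other_node_write(&fs, 2);  /* allowed, because the lock was released */
    replay_journal(&fs);       /* stale replay clobbers the newer value */
    return fs.disk_value;
}

/* Panic ordering: recovery replays the journal before locks are granted. */
int final_value_after_panic(void)
{
    struct toy_fs fs = { .disk_value = 0, .journal_value = 1,
                         .journal_dirty = true };

    replay_journal(&fs);       /* replayed by a surviving node first */
    other_node_write(&fs, 2);  /* only now may another node modify it */
    return fs.disk_value;
}
```

In this model the withdraw ordering ends with the stale journaled value overwriting the other node's update, while the panic ordering ends with the newer value intact. That is the difference the patch description is pointing at.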

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing to the journal

2018-12-17 Thread Edwin Török

On 17/12/2018 13:54, Bob Peterson wrote:
> Hi,
> 
> Before this patch, gfs2 would try to withdraw when it encountered
> io errors writing to its journal. That's incorrect behavior
> because if it can't write to the journal, it cannot write revokes
> for the metadata it sends down. A withdraw will cause gfs2 to
> unmount the file system from dlm, which is a controlled shutdown,
> but the io error means it cannot write the UNMOUNT log header
> to the journal. The controlled shutdown will cause dlm to release
> all its locks, allowing other nodes to update the metadata.
> When the node rejoins the cluster and sees no UNMOUNT log header
> it will see the journal is dirty and replay it, but after the
> other nodes may have changed the metadata, thus corrupting the
> file system.
> 
> If we get an io error writing to the journal, the only correct
> thing to do is to kernel panic. 

Hi,

That may be required for correctness, however are we sure there is no
other way to force the DLM recovery (or can another mechanism be
introduced)?
Consider that there might be multiple GFS2 filesystems mounted from
different iSCSI backends, just because one of them encountered an I/O
error the other ones may still be good to continue.
(Also the host might have other filesystems mounted: local, NFS, it
might still be able to perform I/O on those, so bringing the whole host
down would be best avoided).

Best regards,
--Edwin

> That will force dlm to go through
> its full recovery process on the other cluster nodes, freeze all
> locks, and make sure the journal is replayed by a node in the
> cluster before any other nodes get the affected locks and try to
> modify the metadata in the unfinished portion of the journal.
> 
> This patch changes the behavior so that io errors encountered
> in the journals cause an immediate kernel panic with a message.
> However, quota update errors are still allowed to withdraw as
> before.
> 
> Signed-off-by: Bob Peterson 
> ---
>  fs/gfs2/lops.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
> index 94dcab655bc0..44b85f7675d4 100644
> --- a/fs/gfs2/lops.c
> +++ b/fs/gfs2/lops.c
> @@ -209,11 +209,9 @@ static void gfs2_end_log_write(struct bio *bio)
>  	struct page *page;
>  	int i;
>  
> -	if (bio->bi_status) {
> -		fs_err(sdp, "Error %d writing to journal, jid=%u\n",
> -		       bio->bi_status, sdp->sd_jdesc->jd_jid);
> -		wake_up(&sdp->sd_logd_waitq);
> -	}
> +	if (bio->bi_status)
> +		panic("Error %d writing to journal, jid=%u\n", bio->bi_status,
> +		      sdp->sd_jdesc->jd_jid);
>  
>  	bio_for_each_segment_all(bvec, bio, i) {
>  		page = bvec->bv_page;
>

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing to the journal

2018-12-17 Thread Steven Whitehouse


Hi,

On 17/12/2018 09:04, Edwin Török wrote:

> On 17/12/2018 13:54, Bob Peterson wrote:
>> Hi,
>>
>> Before this patch, gfs2 would try to withdraw when it encountered
>> io errors writing to its journal. That's incorrect behavior
>> because if it can't write to the journal, it cannot write revokes
>> for the metadata it sends down. A withdraw will cause gfs2 to
>> unmount the file system from dlm, which is a controlled shutdown,
>> but the io error means it cannot write the UNMOUNT log header
>> to the journal. The controlled shutdown will cause dlm to release
>> all its locks, allowing other nodes to update the metadata.
>> When the node rejoins the cluster and sees no UNMOUNT log header
>> it will see the journal is dirty and replay it, but after the
>> other nodes may have changed the metadata, thus corrupting the
>> file system.
>>
>> If we get an io error writing to the journal, the only correct
>> thing to do is to kernel panic.
>
> Hi,
>
> That may be required for correctness, however are we sure there is no
> other way to force the DLM recovery (or can another mechanism be
> introduced)?
> Consider that there might be multiple GFS2 filesystems mounted from
> different iSCSI backends, just because one of them encountered an I/O
> error the other ones may still be good to continue.
> (Also the host might have other filesystems mounted: local, NFS, it
> might still be able to perform I/O on those, so bringing the whole host
> down would be best avoided).
>
> Best regards,
> --Edwin


Indeed. I think the issue here is that we need to ensure that the other 
cluster nodes understand what has happened. At the moment the mechanism 
for that is that the node is fenced, so panicking, while it is not ideal, 
does at least mean that will definitely happen.


I agree though that we want something better longer term,

Steve.



Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing to the journal

2018-12-17 Thread Bob Peterson

Hi,

- Original Message -
> On 17/12/2018 09:04, Edwin Török wrote:
> >> If we get an io error writing to the journal, the only correct
> >> thing to do is to kernel panic.
> > Hi,
> >
> > That may be required for correctness, however are we sure there is no
> > other way to force the DLM recovery (or can another mechanism be
> > introduced)?
> > Consider that there might be multiple GFS2 filesystems mounted from
> > different iSCSI backends, just because one of them encountered an I/O
> > error the other ones may still be good to continue.
> > (Also the host might have other filesystems mounted: local, NFS, it
> > might still be able to perform I/O on those, so bringing the whole host
> > down would be best avoided).
> >
> > Best regards,
> > --Edwin
> 
> Indeed. I think the issue here is that we need to ensure that the other
> cluster nodes understand what has happened. At the moment the mechanism
> for that is that the node is fenced, so panicking, while it is not ideal,
> does at least mean that will definitely happen.
> 
> I agree though that we want something better longer term,
> 
> Steve.

The important thing is to guarantee that the journal is replayed by
a node (other than the node that had the IO error writing to its journal)
before any other node is allowed to acquire any of the locks held by the
node with the journal IO error. Before this patch, I had two others:

(1) The first made GFS2 perform journal recovery on a different node
whenever a withdraw is done. This is a bit tricky, since it needs
to communicate which journal needs replaying (or alternately, try to
acquire and replay them all), and it needs to happen before DLM can
hand the locks to another node. I tried to figure out a good way to
hook this into DLM's or lock_dlm's recovery path, but I couldn't find
an acceptable way to do it. In the DLM case, the recovery is all driven
from the top (user-space / dlm_controld / corosync / etc.) down and
I couldn't find a good place to do this without getting DLM out of
sync with its user-space counterparts.

So I created new functions as part of lock_dlm's recovery path
(bits that were formerly in user space, as part of gfs_controld).
I used lvbs to communicate the IDs of all journals needing recovery
and since DLM only updates lvb information on convert operations,
I needed to demote / promote a universally known lock to do it
(I used gfs2's "Live" glock for this purpose.)
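
As a rough illustration of passing a set of journal IDs through a lock value block, the sketch below packs one bit per journal into a fixed-size byte array. Everything here is an assumption made for the example and not the layout used by the patch described above or by lock_dlm: the 32-byte size (`TOY_LVB_LEN`), the one-bit-per-jid encoding, and the `lvb_*()` function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_LVB_LEN 32                  /* assumed LVB size in bytes */
#define TOY_MAX_JID (TOY_LVB_LEN * 8)   /* one bit per journal ID */

/* Start with no journals marked. */
void lvb_clear(uint8_t *lvb)
{
    memset(lvb, 0, TOY_LVB_LEN);
}

/* Mark journal `jid` as needing recovery. */
void lvb_mark_jid(uint8_t *lvb, unsigned int jid)
{
    assert(jid < TOY_MAX_JID);
    lvb[jid / 8] |= (uint8_t)(1u << (jid % 8));
}

/* Clear the mark once journal `jid` has been replayed. */
void lvb_clear_jid(uint8_t *lvb, unsigned int jid)
{
    assert(jid < TOY_MAX_JID);
    lvb[jid / 8] &= (uint8_t)~(1u << (jid % 8));
}

/* Does journal `jid` still need recovery? */
bool lvb_jid_needs_recovery(const uint8_t *lvb, unsigned int jid)
{
    assert(jid < TOY_MAX_JID);
    return (lvb[jid / 8] >> (jid % 8)) & 1u;
}
```

The withdrawing node would set its bit before the convert that publishes the LVB, and a recovering node would clear the bit after replaying that journal. The demote/promote dance needed to get the LVB written is outside the scope of this sketch.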

Doing all these demotes and promotes is complicated and Andreas did
not like it at all, but I couldn't think of a better way. I could code
it so that the node attempts recovery on all journals, and it would
just fail its "try locks" with the other journals that are in use,
but it would result in a lot of dmesg noise, and possibly several
nodes replaying the same journal one after another (depending on
the timing of the locks), plus all this recovery risks corosync
being further starved for CPU and fencing nodes.

Given my discussions with Dave Teigland (upstream dlm maintainer), we
may still want (or need) this for all GFS2 withdraw situations.

(2) The second patch detected the journal IO error and simply refused
to inform DLM that it had unlocked any and all of its locks since
the IO error occurred. That accomplished the job, but predictably,
it caused the glocks to get out of sync with the dlm locks, which
eventually resulted in a BUG() with kernel panic anyway.

I suppose we could add special exceptions so it doesn't panic when
the file system is withdrawn. It also caused the other nodes to hang
as soon as they tried to acquire the rgrp glocks needed to do their
IO, and they stayed hung until the failed node was fenced and
rebooted and the journal recovery was done.

We also might be able to handle this and set some kind of status
before it tries to release the dlm locks to avoid the BUG(),
but the withdrawing node wouldn't be able to unmount (unless we
kludged it even more to free a locked glock or something).
Anything we do is bound to be an ugly hack.

I suppose if a node was working exclusively in different file systems
they wouldn't hang, and maybe that's better behavior. Or maybe not.

Believe me, I thought long and hard about how to better accomplish this,
but never found a better (or simpler) way. A kernel panic is also what
Dave Teigland recommended. Unless I'm mistaken, Dave has said that GFS2
should never withdraw; it should always just kernel panic (Dave, correct
me if I'm wrong). At least this patch confines that behavior to a small
subset of withdraws.

I'm definitely open to ideas on how to better fix this, but I'm out of ideas. 
Just because I'm out of ideas doesn't mean there isn't a good way to do it.
Feel free to make suggestions if you can think of a better way to handle
this situation.

Regards,

Bob Peterson
Red Hat File Systems

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Panic when an io error occurs writing to the journal

2018-12-17 Thread David Teigland

On Mon, Dec 17, 2018 at 09:58:47AM -0500, Bob Peterson wrote:
> Dave Teigland recommended. Unless I'm mistaken, Dave has said that GFS2
> should never withdraw; it should always just kernel panic (Dave, correct
> me if I'm wrong). At least this patch confines that behavior to a small
> subset of withdraws.

The basic idea is that you want to get a malfunctioning node out of the
way as quickly as possible so others can recover and carry on.  Escalating
a partial failure into a total node failure is the best way to do that in
this case.  Specialized recovery paths run from a partially failed node
won't be as reliable, and are prone to blocking all the nodes.

I think a reasonable alternative to this is to just sit in an infinite
retry loop until the i/o succeeds.

Dave
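
For reference, the "retry until the i/o succeeds" alternative might look roughly like the user-space C sketch below. It is not kernel code and not part of any posted patch: `retry_until_success()`, the `write_fn` / `wait_fn` callback types, the `flaky_dev` test double and the 10 ms to 1000 ms capped backoff are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int  (*write_fn)(void *ctx);                   /* 0 on success */
typedef void (*wait_fn)(unsigned int ms, void *ctx);   /* delay hook */

/*
 * Retry the write forever, backing off between attempts.
 * Returns the number of attempts it took to succeed.
 */
unsigned long retry_until_success(write_fn do_write, wait_fn do_wait,
                                  void *ctx)
{
    const unsigned int max_delay_ms = 1000;
    unsigned int delay_ms = 10;
    unsigned long attempts = 0;

    assert(do_write != NULL);
    for (;;) {
        attempts++;
        if (do_write(ctx) == 0)
            return attempts;
        if (do_wait != NULL)
            do_wait(delay_ms, ctx);
        /* Double the delay, capped at max_delay_ms. */
        delay_ms *= 2;
        if (delay_ms > max_delay_ms)
            delay_ms = max_delay_ms;
    }
}

/* Test double: fails a fixed number of times, then succeeds. */
struct flaky_dev {
    int          failures_left;
    unsigned int total_wait_ms;
};

int flaky_write(void *ctx)
{
    struct flaky_dev *dev = ctx;

    if (dev->failures_left > 0) {
        dev->failures_left--;
        return -1;
    }
    return 0;
}

void record_wait(unsigned int ms, void *ctx)
{
    struct flaky_dev *dev = ctx;

    dev->total_wait_ms += ms;
}
```

The loop never gives up, so a device that never comes back leaves the node holding its locks indefinitely.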
