On 08/02/2012 07:18 PM, Arne Jansen wrote:
> On 02.08.2012 12:36, Liu Bo wrote:
>> On 08/02/2012 06:30 PM, Stefan Behrens wrote:
>>> On Wed, 01 Aug 2012 16:31:54 +0200, Stefan Behrens wrote:
>>>> On Wed, 01 Aug 2012 21:31:58 +0800, Liu Bo wrote:
>>>>> On 08/01/2012 09:07 PM, Jan Schmidt wrote:
>>>>>> On Wed, August 01, 2012 at 14:02 (+0200), Liu Bo wrote:
>>>>>>> On 08/01/2012 07:45 PM, Stefan Behrens wrote:
>>>>>>>> With commit acce952b0, btrfs was changed to flag the filesystem with
>>>>>>>> BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal
>>>>>>>> error happened like a write I/O errors of all mirrors.
>>>>>>>> In such situations, on unmount, the superblock is written in
>>>>>>>> btrfs_error_commit_super(). This is done with the intention to be able
>>>>>>>> to evaluate the error flag on the next mount. A warning is printed
>>>>>>>> in this case during the next mount and the log tree is ignored.
>>>>>>>>
>>>>>>>> The issue is that it is possible that the superblock points to a root
>>>>>>>> that was not written (due to write I/O errors).
>>>>>>>> The result is that the filesystem cannot be mounted. btrfsck also does
>>>>>>>> not start and all the other btrfs-progs tools fail to start as well.
>>>>>>>> However, mount -o recovery is working well and does the right things
>>>>>>>> to recover the filesystem (i.e., don't use the log root, clear the
>>>>>>>> free space cache and use the next mountable root that is stored in the
>>>>>>>> root backup array).
>>>>>>>>
>>>>>>>> This patch removes the writing of the superblock when
>>>>>>>> BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error
>>>>>>>> flag in the mount function.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, I have to admit that this can be a serious problem.
>>>>>>>
>>>>>>> But we'll need to send the error flag stored in the super block into
>>>>>>> disk in the future so that the next mount can find it unstable and do
>>>>>>> fsck by itself maybe.
>>>>>>
>>>>>> Hum, that's possible.
>>>>>> However, I neither see
>>>>>>
>>>>>> a) a safe way to get that flag to disk
>>>>>>
>>>>>> nor
>>>>>>
>>>>>> b) a situation where this flag would help. When we abort a transaction,
>>>>>> we just roll everything back to the last commit, i.e. a consistent
>>>>>> state. So if we stop writing a potentially corrupt super block, we
>>>>>> should be fine anyway. Or am I missing something?
>>>>>>
>>>>>
>>>>> I'm just wondering if we can roll everything back well, why do we need
>>>>> fsck?
>>>>
>>>> If the disks support barriers, we roll everything back very well. The
>>>> most recent superblock on the disks always defines a consistent
>>>> filesystem state. There are only two remaining filesystem consistency
>>>> issues left that can cause inconsistent states, one is the one that the
>>>> patch in this email addresses, and the second one is that the error
>>>> result from barrier_all_devices() is ignored (which I want to change next).
>>>
>>> Hi Liu Bo,
>>>
>>> Do you have any remaining objections to that patch?
>>>
>>
>> Hi Stefan,
>>
>> Still I have another question:
>>
>> Our metadata can be flushed into disk if we reach the limit, 32k, so we
>> can end up with updated metadata and the latest superblock if we do not
>> write the current super block.
>
> The old metadata stays valid until the new superblock is written,
> so no problem here, or maybe I don't understand your question :)
>
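A toy model may help make the two cases above concrete. This is only a sketch and not btrfs code: the class ToyCowDisk and all of its methods are made up for illustration. It assumes nothing more than that new metadata is written copy-on-write to fresh locations and that the superblock is the single pointer to the current root.

```python
# Toy model only, NOT btrfs code. Every name here is hypothetical.

class ToyCowDisk:
    """A disk with copy-on-write tree roots and one superblock pointer."""

    def __init__(self):
        self.blocks = {0: "initial tree"}  # block address -> contents
        self.superblock_root = 0           # root the superblock points to
        self.next_addr = 1

    def write_new_root(self, data, fail=False):
        """Write new metadata to a fresh address (never overwrite in place).

        Returns the address the root was meant to land at. With fail=True
        the write is lost, simulating a write I/O error on all mirrors.
        """
        addr = self.next_addr
        self.next_addr += 1
        if not fail:
            self.blocks[addr] = data
        return addr

    def write_superblock(self, root_addr):
        self.superblock_root = root_addr

    def mount(self):
        if self.superblock_root not in self.blocks:
            raise OSError("superblock points to a root that was not written")
        return self.blocks[self.superblock_root]


def scenario_skip_superblock():
    """New metadata was flushed early, but the superblock was not written."""
    disk = ToyCowDisk()
    disk.write_new_root("new tree")
    return disk.mount()  # still sees the old, consistent tree


def scenario_write_superblock_after_error():
    """The root write failed, but the superblock was written anyway."""
    disk = ToyCowDisk()
    addr = disk.write_new_root("new tree", fail=True)
    disk.write_superblock(addr)
    try:
        disk.mount()
        return "mounted"
    except OSError:
        return "unmountable"


if __name__ == "__main__":
    print(scenario_skip_superblock())
    print(scenario_write_superblock_after_error())
```

In this model, skipping the superblock write leaves the old tree reachable, while writing it after a failed root write is what makes the toy filesystem unmountable. The error flag, the log tree, and the root backup array are deliberately left out.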
Yeah, Arne, you're right :)

But for undetected and unexpected errors, as Arne mentioned, I want to
keep the error flag, which can tell users that running fsck on this FS
is at least recommended (though not mandatory).

thanks,
liubo

>>
>> Any ideas?
>>
>> thanks,
>> liubo
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html