As I expected, no new info:
[57291.417113] NILFS warning: mounting unchecked fs
[57291.417125] NILFS: nilfs_search_super_root
[57291.417132] NILFS: pseg_start 115603456, seg_seq 4347757
[57291.417137] NILFS: cno 1252241, segnum 56447
[62768.349242] NILFS: recovery complete.
[62768.351804] segctord starting. Construction interval = 5 seconds,
CP frequency < 30 seconds

So I suppose nilfs is searching for a super root for a very long
time and then finds out that the filesystem is consistent.
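
The timestamps bear this out: 62768.349 - 57291.417 ≈ 5477 seconds, so
roughly 91 minutes pass between the start of nilfs_search_super_root
and "recovery complete."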
--------------------------------------------------
Александров Сергей Васильевич


2012/11/27 Сергей Александров <[email protected]>:
> Sorry, I don't have another multi-TB volume, and this one is 1 TB.
> But I can do a split/resync on the md device to restore the broken state.
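>
> Roughly like this (the device names are just examples, not my real
> layout):
>
>   mdadm /dev/md0 --fail /dev/sdb1     # take one mirror leg offline
>   mdadm /dev/md0 --remove /dev/sdb1   # detach it, keeping the broken state
>   # ...experiments on the detached leg...
>   mdadm /dev/md0 --re-add /dev/sdb1   # put it back and let md resync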
>
> I'll send full log as soon as possible.
> --------------------------------------------------
> Александров Сергей Васильевич
>
>
> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>> On Tue, 2012-11-27 at 10:47 +0300, Сергей Александров wrote:
>>> Hi,
>>>
>>> Yes, it's not the full output; it's the output for about 2 minutes.
>>> I can get a full one, but after that the partition will apparently be
>>> marked as checked and it will be hard to reproduce the issue.
>>> I can also split the md device so that there can be another attempt; should I?
>>
>> If you have enough disk space, it is possible to take a full volume
>> image. As a result, you could reproduce the issue's volume state at
>> any time by mounting the volume replica as a loop device. Could you
>> do it?
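>>
>> Something like this (a minimal sketch; the device name and paths are
>> only examples):
>>
>>   dd if=/dev/md0 of=/backup/nilfs-replica.img bs=1M   # full volume image
>>   mount -t nilfs2 -o loop,ro /backup/nilfs-replica.img /mnt/replica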
>>
>> I think we need the whole output of the mount operation anyway for
>> the issue analysis.
>>
>> With the best regards,
>> Vyacheslav Dubeyko.
>>
>>> --------------------------------------------------
>>> Александров Сергей Васильевич
>>>
>>>
>>> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>>> > Hi,
>>> >
>>> > On Mon, 2012-11-26 at 22:17 +0300, Сергей Александров wrote:
>>> >> Hi,
>>> >>
>>> >> Unfortunately, the only output I saw is:
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068533] NILFS warning: mounting
>>> >> unchecked fs
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068546] NILFS: 
>>> >> nilfs_search_super_root
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068552] NILFS: pseg_start
>>> >> 115603456, seg_seq 4347757
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068558] NILFS: cno 1252241, segnum 
>>> >> 56447
>>> >>
>>> >
>>> > Sorry, the patch is working; the message about searching for the super
>>> > root is its debug output. But I expected more information in this
>>> > output. Did you interrupt the mount operation before it finished?
>>> >
>>> > As I understand it, your output does not cover the whole mount
>>> > operation. Am I correct? Could you share the whole debug output, from
>>> > the beginning of the mount to the end?
>>> >
>>> > Or is this all of the output that you had during such a long mount
>>> > operation?
>>> >
>>> >> I suppose the last three lines are due to the patch.
>>> >>
>>> >> Dynamic debug was on:
>>> >> router dynamic_debug # cat control | grep nilfs
>>> >> fs/nilfs2/recovery.c:851 [nilfs2]nilfs_search_super_root =p "NILFS:
>>> >> cno %lld, segnum %lld\012"
>>> >> fs/nilfs2/recovery.c:850 [nilfs2]nilfs_search_super_root =p "NILFS:
>>> >> pseg_start %lld, seg_seq %lld\012"
>>> >> fs/nilfs2/recovery.c:849 [nilfs2]nilfs_search_super_root =p "NILFS:
>>> >> nilfs_search_super_root\012"
>>> >> fs/nilfs2/recovery.c:766 [nilfs2]nilfs_salvage_orphan_logs =p "NILFS:
>>> >> nilfs_salvage_orphan_logs\012"
>>> >> fs/nilfs2/recovery.c:614 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>>> >> seg_start %lld, seg_end %lld\012"
>>> >> fs/nilfs2/recovery.c:613 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>>> >> segnum %lld\012"
>>> >> fs/nilfs2/recovery.c:612 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>>> >> pseg_start %lld, seg_seq %lld\012"
>>> >> fs/nilfs2/recovery.c:611 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>>> >> nilfs_do_roll_forward\012"
>>> >> fs/nilfs2/recovery.c:442 [nilfs2]nilfs_prepare_segment_for_recovery =p
>>> >> "NILFS: ri_segnum %lld, ri_nextnum %lld\012"
>>> >> fs/nilfs2/recovery.c:440 [nilfs2]nilfs_prepare_segment_for_recovery =p
>>> >> "NILFS: ns_segnum %lld, ns_nextnum %lld\012"
>>> >> fs/nilfs2/recovery.c:438 [nilfs2]nilfs_prepare_segment_for_recovery =p
>>> >> "NILFS: nilfs_prepare_segment_for_recovery\012"
>>> >> fs/nilfs2/recovery.c:719 [nilfs2]nilfs_finish_roll_forward =p "NILFS:
>>> >> nilfs_finish_roll_forward\012"
>>> >
>>> > This is simply the list of known debug message formats.
>>> >
>>> > With the best regards,
>>> > Vyacheslav Dubeyko.
>>> >
>>> >> --------------------------------------------------
>>> >> Александров Сергей Васильевич
>>> >>
>>> >>
>>> >> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>>> >> > Hi,
>>> >> >
>>> >> > On Nov 26, 2012, at 7:19 PM, Сергей Александров wrote:
>>> >> >
>>> >> >> The bug finally appeared again. Here are the trace log and kernel log.
>>> >> >> I've hard-rebooted the machine, so the FS is in the unchecked state
>>> >> >> for now and it is possible to run some other tests.
>>> >> >> I can manage without this partition for a day, maybe two.
>>> >> >>
>>> >> >
>>> >> > Thank you for the logs. But what about my patch with the debug output?
>>> >> > The patch's output can be very helpful for the issue analysis; it is
>>> >> > needed to understand which segments are processed during mount. Could
>>> >> > you share the debug output from my patch?
>>> >> >
>>> >> > With the best regards,
>>> >> > Vyacheslav Dubeyko.
>>> >> >
>>> >> >>
>>> >> >> --------------------------------------------------
>>> >> >> Александров Сергей Васильевич
>>> >> >>
>>> >> >>
>>> >> >> 2012/11/16 Сергей Александров <[email protected]>:
>>> >> >>> OK, thanks. I'll try to apply it. I've also turned on function graph
>>> >> >>> tracing; maybe the graph for the init_nilfs function will help.
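>>> >> >>>
>>> >> >>> This is how I turned it on (it assumes debugfs is mounted at
>>> >> >>> /sys/kernel/debug):
>>> >> >>>
>>> >> >>>   echo init_nilfs > /sys/kernel/debug/tracing/set_graph_function
>>> >> >>>   echo function_graph > /sys/kernel/debug/tracing/current_tracer
>>> >> >>>   cat /sys/kernel/debug/tracing/trace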
>>> >> >>> --------------------------------------------------
>>> >> >>> Александров Сергей Васильевич
>>> >> >>>
>>> >> >>>
>>> >> >>> 2012/11/16 Vyacheslav Dubeyko <[email protected]>:
>>> >> >>>> On Fri, 2012-11-16 at 10:11 +0300, Сергей Александров wrote:
>>> >> >>>>> dmesg:
>>> >> >>>>> [53994.254432] NILFS warning: mounting unchecked fs
>>> >> >>>>> [56686.968229] NILFS: recovery complete.
>>> >> >>>>> [56686.969316] segctord starting. Construction interval = 5 
>>> >> >>>>> seconds,
>>> >> >>>>> CP frequency < 30 seconds
>>> >> >>>>>
>>> >> >>>>> messages:
>>> >> >>>>> Nov 15 10:57:06 router kernel: [53994.254432] NILFS warning: 
>>> >> >>>>> mounting
>>> >> >>>>> unchecked fs
>>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.968229] NILFS: recovery 
>>> >> >>>>> complete.
>>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.969316] segctord starting.
>>> >> >>>>> Construction interval = 5 seconds, CP frequency < 30 seconds
>>> >> >>>>>
>>> >> >>>>> Maybe there is some kernel config option to get more debug output?
>>> >> >>>>>
>>> >> >>>>
>>> >> >>>> I prepared a small patch (please find it attached). This patch simply
>>> >> >>>> adds several debug messages to the recovery module. I suggest applying
>>> >> >>>> the patch to your NILFS2 driver and trying to mount again in the issue
>>> >> >>>> environment. I hope that these debug messages can give more
>>> >> >>>> information about the issue on your side.
>>> >> >>>>
>>> >> >>>> The patch uses the pr_debug() call. Please see
>>> >> >>>> Documentation/dynamic-debug-howto.txt for more details.
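>>> >> >>>>
>>> >> >>>> For example, after loading the patched module, the new messages can
>>> >> >>>> be enabled with something like this (assuming debugfs is mounted at
>>> >> >>>> /sys/kernel/debug):
>>> >> >>>>
>>> >> >>>>   echo 'file fs/nilfs2/recovery.c +p' > /sys/kernel/debug/dynamic_debug/control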
>>> >> >>>>
>>> >> >>>> To be honest, I haven't tested the patch yet. So if you have any
>>> >> >>>> trouble, please e-mail me.
>>> >> >>>>
>>> >> >>>> With the best regards,
>>> >> >>>> Vyacheslav Dubeyko.
>>> >> >>>>
>>> >> >>>>> As for fsck, I have not found it in the public git repo, so where
>>> >> >>>>> can I get the latest version?
>>> >> >>>>> --------------------------------------------------
>>> >> >>>>
>>> >> >> <kern.log><trace_fail>
>>> >> >
>>> >
>>> >
>>
>>
