I managed to get the debug option to compile, but only after I commented out 
these lines in debug.c:

/*#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 22))
        TEST_PAGE_FLAG(page, Pinned, b, size, n, len);
        TEST_PAGE_FLAG(page, Readahead, b, size, n, len);
#endif
*/

Is there perhaps something different about 2.6.27?
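
One thing that can be checked without the kernel tree is whether that version
guard is even in play on this kernel. A minimal sketch, assuming the usual
KERNEL_VERSION encoding from linux/version.h and assuming LINUX_VERSION_CODE
corresponds to 2.6.27 here (the trailing .19 is not part of that code);
guard_is_active is only an illustrative name, not something from debug.c:

```c
#include <stdio.h>

/* Same encoding as the classic macro in linux/version.h. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption: stand-in for the value the 2.6.27 kernel headers provide. */
#define LINUX_VERSION_CODE KERNEL_VERSION(2, 6, 27)

/* Hypothetical helper: returns 1 if the guard used in debug.c is true. */
static int guard_is_active(void)
{
#if (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 22))
        return 1;
#else
        return 0;
#endif
}

/* Prints both encoded values and whether the guarded lines get compiled. */
static void print_guard_status(void)
{
        printf("2.6.27 -> %d, 2.6.22 -> %d, guard active: %d\n",
               LINUX_VERSION_CODE, KERNEL_VERSION(2, 6, 22),
               guard_is_active());
}
```

If that assumption holds, the guard is true on 2.6.27, so the two
TEST_PAGE_FLAG lines are compiled in and the build then depends on the kernel
headers actually providing PagePinned.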

Do I now need to update the mount options with a 'v'?

Thanks
David


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Barham, David
Sent: 06 October 2009 15:24
To: [email protected]
Subject: Re: [NILFS users] NILFS hanging SLES 11 - advise on diagnosis needed

I checked the hardware today as the machine keeps falling over, but only while 
I'm doing data copies into the nilfs2 filesystem via NFS. The last few times 
I've not had any errors in the logs at all, except those from the cleanerd, 
which is in debug mode, and the startup messages after I power cycle the machine.

I tried to build a debug version of 2.0.17 but I'm getting this error:

/usr/local/sources/nilfs-2.0.17 # make
make -C fs
make[1]: Entering directory `/usr/local/sources/nilfs-2.0.17/fs'
make -C /lib/modules/2.6.27.19-5-default/build 
SUBDIRS=/usr/local/sources/nilfs-2.0.17/fs 
BUILD_DIR=/usr/local/sources/nilfs-2.0.17/fs modules
make[2]: Entering directory `/usr/src/linux-2.6.27.19-5-obj/x86_64/default'
make -C ../../../linux-2.6.27.19-5 
O=/usr/src/linux-2.6.27.19-5-obj/x86_64/default/. modules
  CC [M]  /usr/local/sources/nilfs-2.0.17/fs/debug.o
/usr/local/sources/nilfs-2.0.17/fs/debug.c: In function ‘snprint_page_flags’:
/usr/local/sources/nilfs-2.0.17/fs/debug.c:408: error: implicit declaration of 
function ‘PagePinned’
make[5]: *** [/usr/local/sources/nilfs-2.0.17/fs/debug.o] Error 1
make[4]: *** [_module_/usr/local/sources/nilfs-2.0.17/fs] Error 2
make[3]: *** [sub-make] Error 2
make[2]: *** [all] Error 2
make[2]: Leaving directory `/usr/src/linux-2.6.27.19-5-obj/x86_64/default'
make[1]: *** [default] Error 2
make[1]: Leaving directory `/usr/local/sources/nilfs-2.0.17/fs'
make: *** [fs] Error 2

I get the same error trying to make 2.0.16.

I've had a look at the C but I'm out of my depth.

Any help would be great.

Thanks
David Barham
Siemens PLM Software


-----Original Message-----
From: Ryusuke Konishi [mailto:[email protected]] 
Sent: 02 October 2009 12:42
To:; Barham, David
Subject: Re: [NILFS users] NILFS hanging SLES 11 - advise on diagnosis needed

Hi,
On Fri, 2 Oct 2009 12:11:14 +0200, "Barham, David" wrote:
> Hi
> I'm running SLES 11, 2.6.27.19-5-default with NILFS2 nilfs-2.0.16. I have a 
> 1.5TB NILFS2 partition which I am setting up with the intention of using 
> Robocopy from various PCs via samba. The robocopy scripts run nightly and a 
> checkpoint is taken once a night. A script stops samba, unmounts the previous 
> week's checkpoint, deletes the checkpoint, creates a new one and then mounts 
> it and restarts samba. This should mean that at any time the user can go back 
> to 'snapshot_{DAY}' to get their files back.
> 
> So far so good.
> 
> However, as I copy the previously backed up files from the previous Linux 
> machine where I was doing this (only giving a 'current' copy with reiserfs), 
> I'm finding that the new machine is occasionally hanging. The OS just locks 
> up, screen on console frozen but host still responds to ping. 
> 
> I'm trying to work out what is causing the hang. I'm getting various messages 
> in the log from smartd relating to the disk which houses the NILFS along the 
> lines of:
> 
>  Oct  2 09:56:59 cpli6008 syslog-ng[1933]: Log statistics; 
> dropped='pipe(/dev/xconsole)=0', dropped='pipe(/dev/tty10)=0', 
> processed='center(queued)=947', processed='center(received)=478', 
> processed='destination(newsnotice)=0', processed='destination(acpid)=0', 
> processed='destination(firewall)=0', processed='destination(mail)=12', 
> processed='destination(mailinfo)=12', processed='destination(console)=151', 
> processed='destination(newserr)=0', processed='destination(newscrit)=0', 
> processed='destination(messages)=466', processed='destination(mailwarn)=0', 
> processed='destination(localmessages)=0', processed='destination(netmgm)=0', 
> processed='destination(mailerr)=0', processed='destination(xconsole)=151', 
> processed='destination(warn)=155', processed='source(src)=478'
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sda [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 110 to 112
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sdb [SAT], SMART 
> Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 115 to 117
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 189 High_Fly_Writes changed from 88 to 87
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 190 Airflow_Temperature_Cel changed from 60 to 61
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 194 Temperature_Celsius changed from 40 to 39
> Oct  2 09:57:25 cpli6008 smartd[3473]: Device: /dev/sdb [SAT], SMART Usage 
> Attribute: 195 Hardware_ECC_Recovered changed from 50 to 51
> 
> {machine stops responding and gets power cycled}
>
> Oct  2 10:10:58 cpli6008 syslog-ng[1948]: syslog-ng starting up; 
> version='2.0.9'
> 
> Do folks think that the hang is NILFS or dodgy hardware/reporting
> from smartd? Is there any advice on getting some debug or status
> information from NILFS to help show it isn't the cause of the
> problem? I would have expected that if it went bang I'd have seen
> something 'worrying' in the log.

The nilfs2 standalone module has a debug mode.  You can enable it by
uncommenting the following line (i.e. CONFIG_NILFS_DEBUG=y) in
nilfs2-module/fs/Makefile before compiling:

ifndef CONFIG_NILFS
  EXTERNAL_BUILD=y
  CONFIG_NILFS=m
  # Uncomment below to do debug build.
  CONFIG_NILFS_DEBUG=y
  # Uncomment below to enable bmap validity check.
  #CONFIG_NILFS_BMAP_DEBUG=y
endif

By the way, I'm planning to release nilfs-2.0.17 tomorrow in order to
solve file system corruption problems which infrequently happen and
were reported on this list.

The bugfix has already been merged into mainline and has also been sent to
the -stable trees for 2.6.30 and 2.6.31, but that is not yet done.

Your problem looks like a hardware problem to me, but I think the new version
is worth a try.

Cheers,
Ryusuke Konishi
_______________________________________________
users mailing list
[email protected]
https://www.nilfs.org/mailman/listinfo/users