Re: [Lustre-discuss] MDS inode allocation question
Not sure if it was fixed, but there was a bug in Lustre that returned the wrong values here. If you create a bunch of files, the number of inodes reported should go up until it reaches what you expect. Note that the number of inodes on the OSTs also limits the number of creatable files: each file requires an inode on at least one OST (the exact number depends on how many OSTs the file is striped across).

Kevin

Gary Molenkamp wrote:
> When creating the MDS filesystem, I used '-i 1024' on an 860GB logical
> drive to provide approx 800M inodes in the Lustre filesystem. This was
> then verified with 'df -i' on the server:
>
>     /dev/sda    860160000  130452  860029548   1% /data/mds
>
> Later, after completing the OST creation and mounting the full
> filesystem on a client, I noticed that 'df -i' on the client mount is
> only showing 108M inodes in the lfs:
>
>     10.18.1...@tcp:10.18.1...@tcp:/gulfwork
>                 107454606  130452  107324154   1% /gulfwork
>
> A check with 'lfs df -i' shows the MDT only has 108M inodes:
>
>     gulfwork-MDT0000_UUID  107454606  130452  107324154   0% /gulfwork[MDT:0]
>
> Is there a preallocation mechanism in play here, or did I miss something
> critical in the initial setup? My concern is that the inode settings are
> not reconfigurable later, so they must be correct before the filesystem
> goes into production.
>
> FYI, the filesystem was created with:
>
> MDS/MGS on 880G logical drive:
>     mkfs.lustre --fsname gulfwork --mdt --mgs --mkfsoptions='-i 1024'
>         --failnode=10.18.12.1 /dev/sda
>
> OSSs on 9.1TB logical drives:
>     /usr/sbin/mkfs.lustre --fsname gulfwork --ost --mgsnode=10.18.1...@tcp
>         --mgsnode=10.18.1...@tcp /dev/cciss/c0d0
>
> Thanks.
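For reference, a minimal sketch of how to compare the two limits Kevin mentions, using the fsname and mount points from this thread (adjust for your own setup):

    # client-side view: free inodes per MDT and per OST
    lfs df -i /gulfwork

    # raw ldiskfs view on the MDS itself
    df -i /data/mds

    # with a stripe count of 1, the number of files that can still be created
    # is roughly min(free MDT inodes, sum of free OST inodes)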
[Lustre-discuss] MDS inode allocation question
When creating the MDS filesystem, I used '-i 1024' on an 860GB logical drive to provide approx 800M inodes in the Lustre filesystem. This was then verified with 'df -i' on the server:

    /dev/sda    860160000  130452  860029548   1% /data/mds

Later, after completing the OST creation and mounting the full filesystem on a client, I noticed that 'df -i' on the client mount is only showing 108M inodes in the lfs:

    10.18.1...@tcp:10.18.1...@tcp:/gulfwork
                107454606  130452  107324154   1% /gulfwork

A check with 'lfs df -i' shows the MDT only has 108M inodes:

    gulfwork-MDT0000_UUID  107454606  130452  107324154   0% /gulfwork[MDT:0]

Is there a preallocation mechanism in play here, or did I miss something critical in the initial setup? My concern is that the inode settings are not reconfigurable later, so they must be correct before the filesystem goes into production.

FYI, the filesystem was created with:

MDS/MGS on 880G logical drive:
    mkfs.lustre --fsname gulfwork --mdt --mgs --mkfsoptions='-i 1024'
        --failnode=10.18.12.1 /dev/sda

OSSs on 9.1TB logical drives:
    /usr/sbin/mkfs.lustre --fsname gulfwork --ost --mgsnode=10.18.1...@tcp
        --mgsnode=10.18.1...@tcp /dev/cciss/c0d0

Thanks.

--
Gary Molenkamp
SHARCNET Systems Administrator
University of Western Ontario
g...@sharcnet.ca    http://www.sharcnet.ca
(519) 661-2111 x88429    (519) 661-4000
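As a sanity check on the server-side figure, the bytes-per-inode arithmetic works out as expected (assuming the 880G in the mkfs command above is decimal gigabytes):

    # -i 1024 reserves one inode per 1024 bytes of device space:
    #   880 * 10^9 bytes / 1024 bytes-per-inode  ~=  859M inodes
    # which roughly matches the ~860M inodes that 'df -i' reports for /dev/sda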
[Lustre-discuss] Moving files off an OST
Apologies if I'm missing something obvious here. My OSTs are set up in RAID 5 and one of the arrays has a bad stripe, so I need to rebuild it. In preparation for this I want to move all the data off of this OST, so I deactivated the OST on the MDS and ran:

    lfs find --recursive --obd nasone-OST0002_UUID --quiet /lustre | \
        while read F; do cp $F $F.tmp && mv $F.tmp $F; done

This ran for quite a while, and after it finished I ran the find command again to confirm there were no more files on the OST. However, if I look at the OSS I still see 3.4TB of used space on that OST2:

    # df
    Filesystem     1K-blocks        Used   Available Use% Mounted on
    /dev/sdd      5765425880  5223515676   249043440  96% /mnt/ost4
    /dev/sdc      4804519904  3479755816  1080708536  77% /mnt/ost2

Does this make any sense at all, or am I missing something obvious here? I was expecting (hoping) to see the used space back to almost zero, so does this mean I have quite a bit of lost data?

Any help?

Regards
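A minimal sketch of how one might double-check a drain like this, using the OST UUID and mount point from the message above (illustrative only, not a confirmed diagnosis of this case):

    # list any files that still have objects on the drained OST
    lfs find --obd nasone-OST0002_UUID /lustre

    # per-target usage as seen from a client
    lfs df -h /lustre

    # Note (assumption, not confirmed for this case): while an OST is
    # deactivated on the MDS, destroy requests for unlinked objects are not
    # sent to it, so 'df' on the OSS can keep reporting the old usage until
    # the OST is reactivated and orphan cleanup runs.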
Re: [Lustre-discuss] 1.8.2 "make debs" for 2.6.22.19
On Fri, 2010-04-23 at 08:22 -0500, Hendelman, Rob wrote:
> checking build system type... x86_64-unknown-linux-gnu
> checking host system type... x86_64-unknown-linux-gnu
> checking target system type... x86_64-unknown-linux-gnu
> checking for a BSD-compatible install... /usr/bin/install -c
> checking whether build environment is sane... yes
> checking for gawk... gawk
> checking whether make sets $(MAKE)... yes
> checking for gcc... gcc-
> checking for C compiler default output file name... configure: error: C
> compiler cannot create executables
> See `config.log' for more details.

Looks like something in your environment is confusing configure about what your compiler is. You can override that test simply by setting CC="gcc" (assuming your compiler is gcc and in $PATH) and exporting it before running configure (or make debs, I suppose):

    $ export CC=gcc
    $ make debs

Other than that, you could debug why configure is getting confused.

b.
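Along the same lines, a quick sanity check before overriding anything (plain shell, nothing Lustre-specific):

    # confirm the compiler configure should be finding is actually on $PATH
    which gcc && gcc --version

    # then force it for this build only
    CC=gcc make debs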
Re: [Lustre-discuss] Future of LusterFS?
Yes, we suffer hardware failures. All the time. That is sort of the point of Lustre and a clustered file system :) We have had double-disk failures with raid5 (recovered everything except ~1MB of data), server failures, MDS failures, etc. We successfully recovered from them all. Sure, it can be a little stressful... but it all works.

If server hardware fails, our file system basically hangs until we fix it. Our most common failure is obviously disks... and they are all covered by raid. Since we have mostly direct-attached disk, you have a few minutes' downtime of a server while you replace the disk... everything continues as normal when the server comes back.

--
Dr Stuart Midgley
sdm...@gmail.com

On 23/04/2010, at 18:41, Janne Aho wrote:

> On 23/04/10 11:42, Stu Midgley wrote:
>
>>> Would lustre have issues if using cheap off the shelf components, or
>>> would people here think you need to have high end machines with built-in
>>> redundancy for everything?
>>
>> We run lustre on cheap off the shelf gear. We have 4 generations of
>> cheapish gear in a single 300TB lustre config (40 oss's)
>>
>> It has been running very very well for about 3.5 years now.
>
> This sounds promising.
>
> Have you had any hardware failures?
> If yes, how well has the cluster coped with the loss of the machine(s)?
>
> Any advice you can share from your initial setup of lustre?
Re: [Lustre-discuss] Future of LusterFS?
Our success is based on simplicity. Software raid on direct attached disks with no add-on cards (i.e. ensure the motherboards have Intel Pro 1000 NICs, at least 6 SATA ports, reliable CPUs, etc.).

Our first generation gear consisted of a Supermicro motherboard, 2GB memory, a single dual-core Intel CPU and 6x750GB direct attached disks in a white-box chassis running software raid 5. That was over 3.5 years ago and it will actually be decommissioned tomorrow. 2nd generation were the same boxes, just the latest Supermicro motherboard. 3rd generation were SGI xe250's with 8x1TB direct attached disks with software raid5. 4th generation are SGI/Rackable systems with 12x2TB disks with an LSI/3ware hardware raid6 card.

We absolutely hammer our file system and it has stood the test of time. I think our latest gear went in for ~$420/TB.

--
Dr Stuart Midgley
sdm...@gmail.com

On 23/04/2010, at 23:17, Troy Benjegerdes wrote:

> Taking a break from my current non-computer related work..
>
> My guess based on your success is your gear is not so much cheap, as
> *cost effective high MTBF commodity parts*.
>
> If you go for the absolute bargain basement stuff, you'll have problems
> as individual components flake out.
>
> If you spend way too much money on high-end multi-redundant whizbangs,
> you generally get two things.. redundancy, which in my mind often only
> serves to make the eventual failure worse, and high-quality, long MTBF
> components.
>
> If you can get the high MTBF components without all the redundancy
> (and associated complexity nightmare), then you win.
>
> On Fri, Apr 23, 2010 at 05:42:30PM +0800, Stu Midgley wrote:
>> We run lustre on cheap off the shelf gear. We have 4 generations of
>> cheapish gear in a single 300TB lustre config (40 oss's)
>>
>> It has been running very very well for about 3.5 years now.
>>
>>> Would lustre have issues if using cheap off the shelf components, or
>>> would people here think you need to have high end machines with built-in
>>> redundancy for everything?
Re: [Lustre-discuss] Future of LusterFS?
Taking a break from my current non-computer related work..

My guess based on your success is your gear is not so much cheap, as *cost effective high MTBF commodity parts*.

If you go for the absolute bargain basement stuff, you'll have problems as individual components flake out.

If you spend way too much money on high-end multi-redundant whizbangs, you generally get two things.. redundancy, which in my mind often only serves to make the eventual failure worse, and high-quality, long MTBF components.

If you can get the high MTBF components without all the redundancy (and associated complexity nightmare), then you win.

On Fri, Apr 23, 2010 at 05:42:30PM +0800, Stu Midgley wrote:
> We run lustre on cheap off the shelf gear. We have 4 generations of
> cheapish gear in a single 300TB lustre config (40 oss's)
>
> It has been running very very well for about 3.5 years now.
>
>> Would lustre have issues if using cheap off the shelf components, or
>> would people here think you need to have high end machines with built-in
>> redundancy for everything?

--
Troy Benjegerdes    'da hozer'    ho...@hozed.org
CTO, Freedom Fertilizer, Sustainable wind to NH3    t...@freedomfertilizer.com
Benjegerdes Farms    TerraCarbo biofuels

The challenge in changing the world is not in having great ideas, it's in having stupid simple ideas, as those are the ones that cause change. Intellectual property is one of those great complicated ideas that intellectuals like to intellectualize over, lawyers like to bill too much over, and engineers like to overengineer. Meanwhile, it's the stupid simple ideas underfoot that create wealth. -- Troy, Mar 2010
Re: [Lustre-discuss] Kernel oops after cat on /proc/fs/lustre/mgs/MGS/exports/*/stats
Hi,

This is a known bug that is fixed in 1.8.2:
https://bugzilla.lustre.org/show_bug.cgi?id=21420

Best regards
Wojciech

On 23 April 2010 13:18, Christopher Huhn wrote:
> Dear lustre wizards,
>
> we are experiencing problems on our MDS and our Lustre expert is abroad
> (he just attended the LUG meeting).
>
> One of the symptoms we observe is reproducible kernel oopses when
> viewing some stats files beneath /proc/fs/lustre/mgs/MGS/exports:
>
>     mds:~# cat /proc/fs/lustre/mgs/MGS/exports/10.12...@tcp/stats
>     Killed
>
> [...]
>
> Server and affected client both run Lustre 1.6.7.2 on Debian Etch/x86_64
> in this case. The behavior does not change after a client reboot.
>
> All hints on how to solve this are really appreciated.
>
> Kind regards,
>     Christopher
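Since the fix is in 1.8.2, a quick way to confirm which Lustre version a node is actually running (this /proc path is the one used by the 1.6.x/1.8.x series):

    cat /proc/fs/lustre/version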
Re: [Lustre-discuss] 1.8.2 "make debs" for 2.6.22.19
Good morning Mr. Murrell & List,

I attempted this again & your patch did seem to fix that particular problem. Thanks for the patch. Since I originally posted that question, I've switched to Ubuntu 8.04.4 with the included build system (the previous temporary build machine was recycled...)

    r...@mag-hardy-change:/usr/src/lustre-1.8.2# lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 8.04.4 LTS
    Release:        8.04
    Codename:       hardy

I then applied the patch you posted & did the following:

    ./configure --with-linux=/usr/src/linux    (symlink to linux-2.6.22.19)
    make
    make debs

This gets to:

    ===
    # touch files to same date, to avoid auto*
    find . -type f -print0 | xargs -0 touch -r COPYING; \
    if [ "." != "." ]; then \
            mkdir -p ./build ./lustre/contrib ./libsysio; \
            cp build/Makefile ./build/; \
            cp lustre/contrib/mpich-*.patch ./lustre/contrib/; \
            ln -s ../../../libsysio/include ./libsysio/; \
    fi
    ( cd . && \
      ./configure --disable-dependency-tracking \
            --disable-modules \
            --disable-snmp \
            --disable-client \
            --enable-quota \
            --disable-server )
    checking build system type... x86_64-unknown-linux-gnu
    checking host system type... x86_64-unknown-linux-gnu
    checking target system type... x86_64-unknown-linux-gnu
    checking for a BSD-compatible install... /usr/bin/install -c
    checking whether build environment is sane... yes
    checking for gawk... gawk
    checking whether make sets $(MAKE)... yes
    checking for gcc... gcc-
    checking for C compiler default output file name... configure: error: C
    compiler cannot create executables
    See `config.log' for more details.
    make[1]: *** [configure-stamp] Error 77
    make[1]: Leaving directory `/usr/src/lustre-1.8.2'
    dpkg-buildpackage: failure: debian/rules build gave error exit status 2
    make: *** [debs] Error 2
    ===

Apparently it thinks my compiler is gcc- instead of "gcc"?

Config.log shows:

    This file contains any messages produced by compilers while
    running configure, to aid debugging if configure makes a mistake.

    It was created by Lustre configure LUSTRE_VERSION, which was
    generated by GNU Autoconf 2.59.  Invocation command line was

      $ ./configure --disable-dependency-tracking --disable-modules
        --disable-snmp --disable-client --enable-quota --disable-server

    ## --------- ##
    ## Platform. ##
    ## --------- ##

    hostname = mag-hardy-change
    uname -m = x86_64
    uname -r = 2.6.24-27-server
    uname -s = Linux
    uname -v = #1 SMP Wed Mar 24 11:32:39 UTC 2010

    /usr/bin/uname -p = unknown
    /bin/uname -X     = unknown
    /bin/arch         = unknown
    /usr/bin/arch -k  = unknown
    /usr/convex/getsysinfo = unknown
    hostinfo          = unknown
    /bin/machine      = unknown
    /usr/bin/oslevel  = unknown
    /bin/universe     = unknown

    PATH: /usr/share/modass/gcc-4.2
    PATH: /usr/local/sbin
    PATH: /usr/local/bin
    PATH: /usr/sbin
    PATH: /usr/bin
    PATH: /sbin
    PATH: /bin
    PATH: /usr/games

    ## ----------- ##
    ## Core tests. ##
    ## ----------- ##

    configure:1509: checking build system type
    configure:1527: result: x86_64-unknown-linux-gnu
    configure:1535: checking host system type
    configure:1549: result: x86_64-unknown-linux-gnu
    configure:1557: checking target system type
    configure:1571: result: x86_64-unknown-linux-gnu
    configure:1600: checking for a BSD-compatible install
    configure:1655: result: /usr/bin/install -c
    configure:1666: checking whether build environment is sane
    configure:1709: result: yes
    configure:1742: checking for gawk
    configure:1758: found /usr/bin/gawk
    configure:1768: result: gawk
    configure:1778: checking whether make sets $(MAKE)
    configure:1798: result: yes
    configure:2010: checking for gcc
    configure:2036: result: gcc-
    configure:2280: checking for C compiler version
    configure:2283: gcc- --version >&5
    ./configure: line 2284: gcc-: command not found
    configure:2286: $? = 127
    configure:2288: gcc- -v >&5
    ./configure: line 2289: gcc-: command not found
    configure:2291: $? = 127
    configure:2293: gcc- -V >&5
    ./configure: line 2294: gcc-: command not found
    configure:2296: $? = 127
    configure:2319: checking for C compiler default output file name
    configure:2322: gcc- -Wall -g -O2 -O2 -Wl,-Bsymbolic-functions conftest.c >&5
    ./configure: line 2323: gcc-: command not found
    configure:2325: $? = 127
    configure: failed program was:
    | /* confdefs.h.  */
    |
    | #define PACKAGE_NAME "Lustre"
    | #define PACKAGE_TARNAME "lustre"
    | #define PACKAGE_VERSION "LUSTRE_VERSION"
    | #define PACKAGE_STRING "Lustre LUSTRE_VERSION"
    | #define PACKAGE_BUGREPORT "https://bugzilla.lustre.org/"
    | #define PACKAGE "lustre"
    | #define VERSION "1.8.2"
    | /* end confdefs.h.  */
    |
    | int
    | main ()
    | {
    |
    |   ;
    |   return 0;
    | }
    configure:2364: error: C compiler cannot create executables
    See `config.log' for more details.

    ## ---------------- ##
    ## Cache variables. ##
    ## ---------------- ##

    ac_cv_build=x86_64-unknown-linux-gnu
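For anyone hitting the same failure, a hedged way to narrow down where the stray "gcc-" name comes from before overriding it as suggested earlier in the thread (plain shell, nothing Lustre-specific):

    # what configure cached for the C compiler
    grep 'ac_cv_prog' config.log

    # whether the environment already presets CC to something odd
    env | grep '^CC='

    # and forcing a sane value for this build only
    CC=gcc make debs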
[Lustre-discuss] Kernel oops after cat on /proc/fs/lustre/mgs/MGS/exports/*/stats
Dear lustre wizards,

we are experiencing problems on our MDS and our Lustre expert is abroad (he just attended the LUG meeting).

One of the symptoms we observe is reproducible kernel oopses when viewing some stats files beneath /proc/fs/lustre/mgs/MGS/exports:

    mds:~# cat /proc/fs/lustre/mgs/MGS/exports/10.12...@tcp/stats
    Killed
    mds:~#

    mds kernel: Oops: [38] SMP
    Apr 23 13:23:19 mds kernel: Unable to handle kernel paging request at 00040024 RIP:
    Apr 23 13:23:19 mds kernel: [] :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
    Apr 23 13:23:19 mds kernel: PGD 203067 PUD 0
    Apr 23 13:23:19 mds kernel: Oops: [38] SMP
    Apr 23 13:23:20 mds kernel: CPU 7
    Apr 23 13:23:20 mds kernel: Modules linked in: mds fsfilt_ldiskfs(F) mgs mgc ldiskfs crc16 lustre lov mdc lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs xt_tcpudp iptable_filter ip_tables x_tables drbd cn button ac battery bonding xfs ipmi_si ipmi_devintf ipmi_msghandler serio_raw psmouse joydev pcspkr i2c_i801 i2c_core shpchp pci_hotplug evdev parport_pc parport ext3 jbd mbcache dm_mirror dm_snapshot dm_mod raid10 raid456 xor raid1 raid0 multipath linear md_mod sd_mod ide_cd cdrom ata_generic libata generic usbhid hid piix 3w_9xxx floppy ide_core ehci_hcd uhci_hcd e1000 scsi_mod thermal processor fan
    Apr 23 13:23:20 mds kernel: Pid: 7293, comm: cat Tainted: GF 2.6.22+lustre1.6.7.2+0.credativ.etch.1 #2
    Apr 23 13:23:20 mds kernel: RIP: 0010:[] [] :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
    Apr 23 13:23:20 mds kernel: RSP: 0018:8103ba5f9e48 EFLAGS: 00010282
    Apr 23 13:23:20 mds kernel: RAX: 00040004 RBX: 7fff RCX: 0006
    Apr 23 13:23:20 mds kernel: RDX: 0101010101010101 RSI: RDI:
    Apr 23 13:23:20 mds kernel: RBP: R08: 0008 R09:
    Apr 23 13:23:20 mds kernel: R10: R11: R12:
    Apr 23 13:23:20 mds kernel: R13: R14: R15: 8108000a1760
    Apr 23 13:23:20 mds kernel: FS: 2b4a366786d0() GS:81081004b840() knlGS:
    Apr 23 13:23:20 mds kernel: CS: 0010 DS: ES: CR0: 8005003b
    Apr 23 13:23:20 mds kernel: CR2: 00040024 CR3: 00078f018000 CR4: 06e0
    Apr 23 13:23:20 mds kernel: Process cat (pid: 7293, threadinfo 8103ba5f8000, task 8107dc299530)
    Apr 23 13:23:20 mds kernel: Stack: 0202 00040004 81067dae2640
    Apr 23 13:23:20 mds kernel: 4bd18327 000ca54d 81067dae2640
    Apr 23 13:23:20 mds kernel: 00040004 00040004 0400
    Apr 23 13:23:20 mds kernel: Call Trace:
    Apr 23 13:23:20 mds kernel: [] seq_read+0x105/0x28d
    Apr 23 13:23:20 mds kernel: [] vfs_read+0xcb/0x153
    Apr 23 13:23:20 mds kernel: [] sys_read+0x45/0x6e
    Apr 23 13:23:20 mds kernel: [] system_call+0x7e/0x83
    Apr 23 13:23:20 mds kernel:
    Apr 23 13:23:20 mds kernel:
    Apr 23 13:23:20 mds kernel: Code: 48 8b 50 20 48 8b 48 28 4c 03 60 10 4c 03 68 18 48 39 d3 48
    Apr 23 13:23:20 mds kernel: RIP [] :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
    mds kernel: CR2: 00040024
    Apr 23 13:23:20 mds kernel: RSP
    Apr 23 13:23:20 mds kernel: CR2: 00040024

Server and affected client both run Lustre 1.6.7.2 on Debian Etch/x86_64 in this case. The behavior does not change after a client reboot.

All hints on how to solve this are really appreciated.

Kind regards,
    Christopher

--
Christopher Huhn
Linux therapist

GSI Helmholtzzentrum fuer Schwerionenforschung GmbH
Planckstr. 1
64291 Darmstadt
http://www.gsi.de/
Re: [Lustre-discuss] Future of LusterFS?
On 23/04/10 11:42, Stu Midgley wrote:
>> Would lustre have issues if using cheap off the shelf components, or
>> would people here think you need to have high end machines with built-in
>> redundancy for everything?
>
> We run lustre on cheap off the shelf gear. We have 4 generations of
> cheapish gear in a single 300TB lustre config (40 oss's)
>
> It has been running very very well for about 3.5 years now.

This sounds promising.

Have you had any hardware failures?
If yes, how well has the cluster coped with the loss of the machine(s)?

Any advice you can share from your initial setup of lustre?

--
Janne Aho (Developer) | City Network Hosting AB - www.citynetwork.se
Phone: +46 455 690022 | Cell: +46 733 312775
EMail/MSN: ja...@citynetwork.se
ICQ: 567311547 | Skype: janne_mz | AIM: janne4cn | Gadu: 16275665
Re: [Lustre-discuss] Future of LusterFS?
We run lustre on cheap off the shelf gear. We have 4 generations of cheapish gear in a single 300TB lustre config (40 oss's).

It has been running very very well for about 3.5 years now.

> Would lustre have issues if using cheap off the shelf components, or
> would people here think you need to have high end machines with built-in
> redundancy for everything?

--
Dr Stuart Midgley
sdm...@gmail.com
Re: [Lustre-discuss] Future of LusterFS?
On 22/04/10 17:38, Lundgren, Andrew wrote:

(somehow I managed to send this as a private mail, while it was meant to be sent to the list)

sorry for being old fashioned and answering inline, but it feels less risky.

> I think the lustre 2.0 release notes indicated that lustre will continue but
> may only be supported on Oracle hardware by Oracle.
> If you are doing anything else, it seemed like you would be on your own.

In our economic calculations there isn't much room for support fees, and we would be getting some 3rd party support from one of our partners.

> That said, http://www.clusterstor.com/ is a new company founded by Peter
> Braam (the guy who invented Lustre).
> They are creating a new cluster file system as well as supporting Lustre.
> They have a customers link off of their website that indicates some of the
> notables.

Interesting, but a bit of a hefty price tag for us.

> There is a possibility that there will be a lustre fork in the future.
> Some following Oracle's "opensource" model and the other following the more
> traditional model.

After reading "After the Software Wars" by Keith Curtis, in the long run I think I'll be betting on the open source project rather than the closed one.

We are still here talking a bit about LustreFS vs GlusterFS; as this is the first time we will be using a cluster file system, it feels quite difficult to choose, and at the same time we need to keep the total cost as low as possible.

Would lustre have issues if using cheap off the shelf components, or would people here think you need to have high end machines with built-in redundancy for everything?

--
Janne Aho (Developer) | City Network Hosting AB - www.citynetwork.se
Phone: +46 455 690022 | Cell: +46 733 312775
EMail/MSN: ja...@citynetwork.se
ICQ: 567311547 | Skype: janne_mz | AIM: janne4cn | Gadu: 16275665