[Lustre-discuss] 1.6.5 compile probs

2008-06-16 Thread Heiko Schroeter
Hello, I'm trying to compile the 1.6.5 sources against a vanilla-2.6.22.19 kernel. Distrib: Gentoo 2008.
- setting the links and patching the kernel as described with "quilt push -av": ok
- compiling and booting the kernel: ok
- configuring lustre: ./configure --disable-liblustre --enable-quota
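
For reference, a minimal sketch of that build sequence, assuming the patched kernel tree lives at /usr/src/linux-2.6.22.19 (the paths here are illustrative, not the poster's):

  # apply the Lustre kernel patch series to the vanilla tree
  cd /usr/src/linux-2.6.22.19
  quilt push -av

  # build Lustre against the patched, booted kernel
  cd /usr/src/lustre-1.6.5
  ./configure --disable-liblustre --enable-quota \
      --with-linux=/usr/src/linux-2.6.22.19
  make && make install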

Re: [Lustre-discuss] 1.6.5 compile probs

2008-06-18 Thread Heiko Schroeter
On Tuesday, 17 June 2008 15:44:00, Brian J. Murrell wrote:
> On Mon, 2008-06-16 at 12:23 +0200, Heiko Schroeter wrote:
> > Hello,
> >
> > I'm trying to compile the 1.6.5 sources against a vanilla-2.6.22.19
> > kernel. Distrib: Gentoo 2008
> >
> > -

[Lustre-discuss] Failover Setup MDS/MDT

2008-06-19 Thread Heiko Schroeter
Hello, the docs state for a failover setup that the MDS and MDT should be on separate nodes. Therefore we would like to know what a "good" scenario could be. Does it make sense, or is it even possible, to set up an MDS with failover capabilities, i.e. something like (mirrored with DRBD here for ex
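
As background: a failover partner is normally declared at format time with --failnode so that clients know where to reconnect. A minimal sketch, with made-up NIDs and a DRBD device as the shared backing store:

  # format the MDT on the DRBD device; the partner NID is hypothetical
  mkfs.lustre --fsname foo --mgs --mdt \
      --failnode=192.168.16.2@tcp /dev/drbd0
  # the standby node mounts the same DRBD device only after takeover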

Re: [Lustre-discuss] Failover Setup MDS/MDT

2008-06-25 Thread Heiko Schroeter
On Friday, 20 June 2008 16:20:23, Bernd Schubert wrote:
> On Friday 20 June 2008 16:08:23 Brian J. Murrell wrote:
> > On Fri, 2008-06-20 at 16:01 +0200, Bernd Schubert wrote:
> > > We do it for several lustre installations and it works fine.
> >
> > Have you done any "intensive" failover testing

Re: [Lustre-discuss] Failover Setup MDS/MDT

2008-06-25 Thread Heiko Schroeter
On Wednesday, 25 June 2008 14:19:11, Brian J. Murrell wrote:
> On Wed, 2008-06-25 at 07:36 +0200, Heiko Schroeter wrote:
> > How can one determine the size for the MDT partition, or is that the same
> > as the MDS device?
> > (As far as I can see the MDT takes the DIR info

Re: [Lustre-discuss] Failover Setup MDS/MDT

2008-06-26 Thread Heiko Schroeter
On Thursday, 26 June 2008 01:56:23, Sheila Barthel wrote:
> In the current Lustre manual (v. 1_12), section 3.2.2 is Lustre Tools.
> Section 21.3.2 is Calculating MDT Size, which includes an inode
> calculation example and does not refer to the MDS.
>
> http://manual.lustre.org/manual/LustreMan
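
The calculation in that manual section boils down to expected file count times per-inode space on the MDT. A rough worked example, assuming the commonly cited 4 kB of MDT space per inode (treat that figure as an assumption, not a spec):

  10,000,000 expected files x 4 kB/inode = ~40 GB minimum MDT size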

[Lustre-discuss] lustre client 1.6.5.1 hangs

2008-07-10 Thread Heiko Schroeter
Hello, we have a _test_ setup for a lustre 1.6.5.1 installation with 2 RAID systems (64 bit systems) counting for 4 OSTs with 6 TB each, and one combined MDS and MDT server (32 bit system, for testing only). OST lustre mkfs: "mkfs.lustre --param="failover.mode=failout" --fsname scia --ost --mkfsop
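
For context, a complete OST format command in the same style might look like the sketch below; the --mkfsoptions value, MGS NID and device are illustrative, not the poster's actual (truncated) settings:

  mkfs.lustre --param="failover.mode=failout" --fsname scia --ost \
      --mkfsoptions="-i 1048576" --mgsnode=192.168.16.1@tcp /dev/sdb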

Re: [Lustre-discuss] lustre client 1.6.5.1 hangs

2008-07-13 Thread Heiko Schroeter
On Thursday, 10 July 2008 19:35:57, you wrote:

Hi.

> > OST lustre mkfs:
> > "mkfs.lustre --param="failover.mode=failout" --fsname
> >                                      ^^^
> Given this (above) parameter setting...

Is 'failout' not ok? Actually we like to use it because we like t

Re: [Lustre-discuss] lustre client 1.6.5.1 hangs

2008-07-13 Thread Heiko Schroeter
On Thursday, 10 July 2008 19:35:57, Brian J. Murrell wrote:
> Well, in fact the du and the copy should both EIO when they get to
> trying to write to the unmounted OST.
>
> Can you get a stack trace (sysrq-t) on the client after you have
> unmounted the OST and processes are hung/blocked?

Here
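
For reference, the sysrq-t trace Brian asks for is usually captured like this on the hung client (standard kernel interface, not lustre-specific):

  # enable magic SysRq, dump all task states, then read the kernel log
  echo 1 > /proc/sys/kernel/sysrq
  echo t > /proc/sysrq-trigger
  dmesg | less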

[Lustre-discuss] OST crash recovery problem

2008-08-13 Thread Heiko Schroeter
Hello, after a crash (hardware failure) of an OST with two lustre partitions, one partition (/dev/sdb) cannot be remounted after restart. The second partition (/dev/sdc) mounts fine. What needs to be done in such a case? I tried to move the mountpoint because of the "file exists" message but tha
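
A common first step here is to check the ldiskfs backend of the failing partition before retrying the mount; a sketch, assuming the OST is unmounted and /mnt/ost1 is the usual mountpoint (illustrative path):

  # read-only check first; repair only if errors are reported
  e2fsck -fn /dev/sdb
  e2fsck -fp /dev/sdb
  mount -t lustre /dev/sdb /mnt/ost1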

[Lustre-discuss] OST crash recovery problem

2008-08-13 Thread Heiko Schroeter
Hello again, any idea what can be done in such a case? Regards Heiko

Hello, after a crash (hardware failure) of an OST with two lustre partitions, one partition (/dev/sdb) cannot be remounted after restart. The second partition (/dev/sdc) mounts fine. What needs to be done in such a case? I

[Lustre-discuss] HLRN lustre breakdown

2008-08-18 Thread Heiko Schroeter
Hello list, does anyone have more background info on what happened there? Regards Heiko

HLRN News - Since Mon Aug 18, 2008 12:00 the HLRN-II complex Berlin is open for users again. During the maintenance it turned out that the Lustre file system holding the users' $WORK and $TMPDIR wa

[Lustre-discuss] l_getgroups message

2008-08-18 Thread Heiko Schroeter
Hello, from time to time we see these messages on our MDS 1.6.5.1 while copying data onto lustre. Is this just informational, or an indicator of a broken setup? Network load problems? We checked the group rights and they look ok to us. The lustre MDS system including clients runs with YP se
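
On 1.6 the supplementary-group lookups go through the MDS group upcall, which defaults to l_getgroups; one way to inspect the setting (the MDT device name below is hypothetical):

  cat /proc/fs/lustre/mds/scia-MDT0000/group_upcall
  # usually prints /usr/sbin/l_getgroups; writing NONE disables the upcall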

[Lustre-discuss] OST crash recovery problem

2008-08-19 Thread Heiko Schroeter
Hello, replying to myself. No, we couldn't get lustre up again and had to reinstall from scratch. :-( Keeping fingers crossed now that we are running the production system. What bugs us is this part of the message on the MDS:

Aug 13 11:18:54 sadosrd20 LustreError: 15c-8: [EMAIL PROTECTED]: The c

Re: [Lustre-discuss] l_getgroups message

2008-08-19 Thread Heiko Schroeter
On Tuesday, 19 August 2008 01:59:09, Andreas Dilger wrote:
> On Aug 18, 2008 15:46 +0200, Heiko Schroeter wrote:
> > from time to time we see these messages on our MDS 1.6.5.1 while
> > copying data onto lustre.
> >
> > Is this just informational or an indicator

[Lustre-discuss] ldlm_cli_cancel_* , 1.6.5.1

2008-08-20 Thread Heiko Schroeter
Hello, we use LUSTRE with AUTOFS to circumvent (hopefully) the evicted-client statahead problem and to disconnect the lustre client when not in use. We get the following messages just before the client is unmounted. It is reproducible and happens on every unmount. The functionality of further client mou
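
For reference, an autofs map for a lustre client can look like the sketch below; mountpoint, map file, NID and fsname are illustrative, and the location syntax may need adjusting for your autofs version. The --timeout value controls when the idle client gets unmounted:

  # /etc/auto.master
  /lustre  /etc/auto.lustre  --timeout=60

  # /etc/auto.lustre
  scia  -fstype=lustre  192.168.16.1@tcp0:/scia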

Re: [Lustre-discuss] HLRN lustre breakdown

2008-08-21 Thread Heiko Schroeter
> whole FS and only during that
> action we finally (after nearly 48 h) found that bad drive.
>
> It had nothing to do with the lustre FS itself. Lustre had been the
> victim of a HW failure on a Raid6 lun."
>
> I hope that this helps
>
> PJones

Heiko Schroete

Re: [Lustre-discuss] OST crash recovery problem

2008-08-25 Thread Heiko Schroeter
for your effort. Regards Heiko

> On Thu, Aug 14, 2008 at 08:40:05AM +0200, Heiko Schroeter wrote:
> > What needs to be done in such a case?
> > I tried to move the mountpoint because of the "file exists" message but
> > that does not help.
> >
> > Aug 1

Re: [Lustre-discuss] OST crash recovery problem

2008-08-26 Thread Heiko Schroeter
would like to know what the cause (besides the power loss) is and how to repair a lustre system in such a case. Heiko

> Darn, we are curious what happened now.
>
> On Mon, Aug 25, 2008 at 9:35 AM, Heiko Schroeter
> <[EMAIL PROTECTED]> wrote:
> > On Monday, 25 August

Re: [Lustre-discuss] OST crash recovery problem

2008-08-26 Thread Heiko Schroeter
" :-)

> On Tue, Aug 26, 2008 at 9:03 AM, Heiko Schroeter
> <[EMAIL PROTECTED]> wrote:
> > Since the new setup everything is running fine. Why?
> >
> > Except my backbone, which keeps on itching when something is not quite
> > figured

Re: [Lustre-discuss] OST crash recovery problem

2008-08-26 Thread Heiko Schroeter
On Tuesday, 26 August 2008 15:35:38, Jeremy Mann wrote:
> Mag Gam wrote:
> > LOL... I am in the same situation. I want to see what problems other
> > people have so I can try to help them, and I can further avoid it. I am
> > a big proponent of "your problems are my problems" :-)
>
> When we firs

Re: [Lustre-discuss] drbd async mode

2008-09-22 Thread Heiko Schroeter
On Tuesday, 2 September 2008 22:24:58, Andreas Dilger wrote:

We are using a setup with the following spec: mds1 and mds2 each have 2 SATA disks for the MDS inside, running raid0 with the mdadm tools. On top of this sits drbd for network mirroring, and above that heartbeat (HA) for failover. Besides that, stonith is running inside HA for
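
A skeletal drbd resource for such an MDS pair might look like this (hostnames, devices and addresses are made up; the protocol line is where synchronous vs. asynchronous replication is chosen):

  resource mds {
    protocol C;                  # synchronous; protocol A would be async
    on mds1 {
      device    /dev/drbd0;
      disk      /dev/md0;        # the local mdadm raid0 array
      address   192.168.16.1:7788;
      meta-disk internal;
    }
    on mds2 {
      device    /dev/drbd0;
      disk      /dev/md0;
      address   192.168.16.2:7788;
      meta-disk internal;
    }
  }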

Re: [Lustre-discuss] lustre/drbd/heartbeat setup [was: drbd async mode]

2008-10-02 Thread Heiko Schroeter
On Thursday, 2 October 2008 10:32:44, Wojciech Turek wrote:
> Hi,
>
> Thanks for that. I was thinking about trying drbd on my MDSs, so I find your
> PDF very useful.

Heiko Schroeter wrote:
> Hello,
>
> at last a first version of our setup scenario is ready.
>

[Lustre-discuss] Removing an OST

2009-01-16 Thread Heiko Schroeter
Hello list, the manual (May 2008) describes on page 4-14 how to remove an OST from LUSTRE. It says "deactivate the OST (make it read-only)". Hm, I'm a bit puzzled here. When I deactivate the OST on the MDS using 'lctl --device 18 conf_param foo-OST000d.osc.active=0', the device is deactivated bu
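
The deactivation step itself, sketched for reference (the device index and OST name are the ones from above; yours will differ and come from lctl dl):

  # find the osc device index for the OST on the MDS
  lctl dl | grep OST000d
  # stop new object allocations on that OST
  lctl --device 18 conf_param foo-OST000d.osc.active=0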

Re: [Lustre-discuss] Removing an OST

2009-01-19 Thread Heiko Schroeter
Thanks very much for the clarification; it works. Actually I wasn't aware of the differences between set_param and conf_param. Ashes on me. Is it possible to get this info into the documentation, as it may hit one or the other in the lustre community? The manual only gives two 'general' a
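
For the record, the two commands differ in scope; a sketch, with hypothetical device names (the set_param path pattern in particular may vary by version):

  # set_param: takes effect only on the local node, lost on remount
  lctl set_param osc.foo-OST000d-osc.active=0
  # conf_param: written to the MGS configuration log, persistent and cluster-wide
  lctl conf_param foo-OST000d.osc.active=0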

Re: [Lustre-discuss] Removing an OST

2009-01-20 Thread Heiko Schroeter
Hm, I do get that one has to 'move' the files back. But ...
> > Copy the data on a client to temp storage outside (does it have to be
> > outside?) the lustre system and 'move' it back into lustre to create
> > new inode entries on the MDS.
>
> Actually, it should NOT be on storage outside L
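
To find which files have objects on the OST being emptied, lfs can filter by OST UUID; a sketch with illustrative names and paths:

  # list files striped onto the deactivated OST
  lfs find --obd foo-OST000d_UUID /mnt/foo
  # rewriting a file reallocates its objects on the remaining OSTs
  cp /mnt/foo/somefile /mnt/foo/somefile.tmp
  mv /mnt/foo/somefile.tmp /mnt/foo/somefile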

[Lustre-discuss] Removing OST permanently

2009-01-22 Thread Heiko Schroeter
Hello list, I want to remove an OST permanently. On the MDS:

lctl --device 11 conf_param foo-OST0006.osc.active=0

In the messages on the MDS I get:

Jan 22 13:42:12 mds1 Lustre: foo-OST0006-osc.osc: set parameter active=0
Jan 22 13:42:12 mds1 Lustre: Skipped 1 previous similar message

On a client

[Lustre-discuss] lctl does not show inactive status OST on client

2009-01-27 Thread Heiko Schroeter
Hello, after deactivating an OST on the MDS with

mds1 ~ # lctl --device 11 conf_param foo-OST0006.osc.active=0

lctl on the mds does mark it as inactive:

mds1 ~ # lctl dl
  0 UP mgs MGS MGS 19
  1 UP mgc mgc192.168.16@tcp bd6b1e83-d312-7501-7196-8db675caa078 5
  2 UP mdt MDS MDS_uuid 3
  3 UP l

[Lustre-discuss] lctl does not show inactive status OST on client

2009-01-30 Thread Heiko Schroeter
Any idea? Regards Heiko

Hello, after deactivating an OST on the MDS with

mds1 ~ # lctl --device 11 conf_param foo-OST0006.osc.active=0

lctl on the mds does mark it as inactive:

mds1 ~ # lctl dl
  0 UP mgs MGS MGS 19
  1 UP mgc mgc192.168.16@tcp bd6b1e83-d312-7501-7196-8db675caa078 5
  2 UP