Re: [Lustre-discuss] 1.8.1.1 -> 1.8.3 upgrade questions

2010-05-26 Thread Gustavsson, Mathias
-Original Message- From: turek.wojci...@googlemail.com on behalf of Wojciech Turek Sent: Tue 5/25/2010 11:33 PM To: Gustavsson, Mathias Cc: lustre-discuss@lists.lustre.org Subject: Re: [Lustre-di

Re: [Lustre-discuss] 1.8.1.1 -> 1.8.3 upgrade questions

2010-05-26 Thread Gustavsson, Mathias
On 2010-05-25, at 09:01, Gustavsson, Mathias wrote: > We tried to do a 1.8.1.1 to 1.8.3 version upgrade this weekend, but we got > i/o error on all of our old file systems (created ~4 years ago), we have a > m

Re: [Lustre-discuss] 1.8.1.1 -> 1.8.3 upgrade questions

2010-05-25 Thread Wojciech Turek
Those look familiar. Have you run fsck on the OSTs and MDTs before upgrade? Best regards, Wojciech On 25 May 2010 16:01, Gustavsson, Mathias wrote: > Hi, > > We tried to do a 1.8.1.1 to 1.8.3 version upgrade this weekend, but we got > i/o error on all of our old file systems (created ~4 years
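
For reference, a minimal sketch of the pre-upgrade check suggested here, assuming ldiskfs-backed targets and the e2fsprogs shipped for Lustre; the device paths are placeholders, and each target must be unmounted first:

    e2fsck -fn /dev/mdtdev    # read-only pass: report what would be fixed, change nothing
    e2fsck -fp /dev/mdtdev    # actual repair pass
    e2fsck -fn /dev/ostdev    # repeat the same two steps for every OST device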

Re: [Lustre-discuss] 1.8.1.1 -> 1.8.3 upgrade questions

2010-05-25 Thread Andreas Dilger
On 2010-05-25, at 09:01, Gustavsson, Mathias wrote: > We tried to do a 1.8.1.1 to 1.8.3 version upgrade this weekend, but we got > i/o error on all of our old file systems (created ~4 years ago), we have a > more recently created filesystem (only a couple of months old) and that was > fine. > Thi

[Lustre-discuss] 1.8.1.1 -> 1.8.3 upgrade questions

2010-05-25 Thread Gustavsson, Mathias
Hi, We tried to do a 1.8.1.1 to 1.8.3 version upgrade this weekend, but we got I/O errors on all of our old file systems (created ~4 years ago); we have a more recently created filesystem (only a couple of months old) and that was fine. This was in the log of the active MDS, I couldn't find anythi
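
One non-destructive check that may help after an upgrade like this (a sketch, assuming ldiskfs targets; the device path is a placeholder) is to dump each target's stored configuration and compare the old and new filesystems:

    tunefs.lustre --dryrun /dev/mdtdev    # prints the on-disk parameters without writing anything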

[Lustre-discuss] 1.8.1.1 forced cleanup

2010-01-25 Thread DT Piotr Wadas
hm, and what is this supposed to mean? :/ srv:~# dmesg | grep -i 'Forced cleanup' | head -1 LustreError: 0-0: Forced cleanup waiting for rlfs-MDT-mdc-c3956400 namespace with 1 resources in use, (rc=-110) srv:~# dmesg | grep -i 'Forced cleanup' | wc -l 101 1.8.1.1 server/client, 32bit, two client s
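
The "resources in use" count from that message can also be read directly from the ldlm namespace counters; a sketch, assuming the lctl from 1.8.x and that the mdc namespace name matches the one in the log:

    lctl get_param ldlm.namespaces.*mdc*.resource_count
    lctl get_param ldlm.namespaces.*mdc*.lock_count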

Re: [Lustre-discuss] 1.8.1.1

2009-12-10 Thread Andreas Dilger
On 2009-12-10, at 16:37, Papp Tamas wrote: > Well, it was not working, and by the way, my guess is that it should > not work. I didn't write, but /usr/src/kernels/ > 2.6.18-128.7.1.el5_lustre.1.8.1.1-x86_64/ belongs to the official > kernel-lustre-devel-2.6.18-128.7.1.el5_lustre.1.8.1.1 packag

Re: [Lustre-discuss] 1.8.1.1

2009-12-10 Thread Papp Tamas
On 2009. 12. 06. 1:47, Andreas Dilger wrote: >> >> ./configure >> --with-linux=/usr/src/kernels/2.6.18-128.7.1.el5_lustre.1.8.1.1-x86_64/ > > > This should be the right kernel for b1_8, according to lustre/ChangeLog > > > This is trying to build the ldiskfs module from the ext3 sources. It > _sh

Re: [Lustre-discuss] 1.8.1.1

2009-12-05 Thread Andreas Dilger
On 2009-12-05, at 10:28, Papp Tamás wrote: > Johann Lombardi wrote, On 2009. 12. 02. 0:18: >> Actually, this is bug 19557 and a patch is pending review. >> I would be delighted if you could give the patch a try. > > How can I rebuild only lustre modules? > > I checked out the b1_8 source tree and a

Re: [Lustre-discuss] 1.8.1.1

2009-12-05 Thread Papp Tamás
Johann Lombardi wrote, On 2009. 12. 02. 0:18: > Actually, this is bug 19557 and a patch is pending review. > I would be delighted if you could give the patch a try. How can I rebuild only lustre modules? I checked out the b1_8 source tree and applied the patch from bugzilla. Then I tried this:
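
A rough sketch of the rebuild sequence being attempted in this thread, assuming a b1_8 checkout with the bugzilla patch applied and the lustre-patched kernel headers path mentioned elsewhere in the thread; the exact targets can differ between trees:

    cd lustre-b1_8
    sh autogen.sh
    ./configure --with-linux=/usr/src/kernels/2.6.18-128.7.1.el5_lustre.1.8.1.1-x86_64/
    make           # builds the kernel modules and userspace tools
    make rpms      # optional: package the result instead of installing directly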

Re: [Lustre-discuss] 1.8.1.1

2009-12-01 Thread Johann Lombardi
On Dec 1, 2009, at 10:51 PM, Papp Tamás wrote: > I'm afraid(?) it's not the same. The stacks still look similar, afaics. > Still, do you think I should recompile the kernel with the change > above? Actually, this is bug 19557 and a patch is pending review. I would be delighted if you could gi

Re: [Lustre-discuss] 1.8.1.1

2009-12-01 Thread Papp Tamás
Andreas Dilger wrote, On 2009. 11. 28. 0:22: > On 2009-11-27, at 03:13, Papp Tamás wrote: >> Craig Prescott wrote, On 2009. 11. 19. 20:42: >>> We had the same problem with 1.8.x.x. >>> >>> We set lnet.printk=0 on our OSS nodes and it has helped us >>> dramatically - we have not seen the 'soft lock

Re: [Lustre-discuss] 1.8.1.1

2009-11-27 Thread Andreas Dilger
On 2009-11-27, at 03:13, Papp Tamás wrote: > Craig Prescott wrote, On 2009. 11. 19. 20:42: >> We had the same problem with 1.8.x.x. >> >> We set lnet.printk=0 on our OSS nodes and it has helped us >> dramatically - we have not seen the 'soft lockup' problem since. >> >> sysctl -w lnet.printk=0 >>

Re: [Lustre-discuss] 1.8.1.1

2009-11-27 Thread Papp Tamás
Craig Prescott wrote, On 2009. 11. 19. 20:42: We had the same problem with 1.8.x.x. We set lnet.printk=0 on our OSS nodes and it has helped us dramatically - we have not seen the 'soft lockup' problem since. sysctl -w lnet.printk=0 This will turn off all but 'emerg' messages from lnet. It w
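
To make that setting persistent across reboots, one option (a sketch using the same sysctl key; note the lnet module must already be loaded when the file is applied, otherwise set it from an init script after Lustre starts) is:

    # in /etc/sysctl.conf -- silence all but 'emerg' console messages from LNET
    lnet.printk = 0
    # apply immediately without a reboot:
    sysctl -p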

Re: [Lustre-discuss] 1.8.1.1

2009-11-19 Thread Papp Tamás
Craig Prescott wrote, On 2009. 11. 19. 20:42: > Papp Tamás wrote: >> The logs are full with this: >> >> Nov 19 20:03:32 node1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! >> [ll_ost_80:4894] >> Nov 19 20:03:32 node1 kernel: CPU 3: > >> Nov 19 20:03:34 node1 kernel: Lustre: Skipped 40339060 pre

Re: [Lustre-discuss] 1.8.1.1

2009-11-19 Thread Craig Prescott
Papp Tamás wrote: > The logs are full with this: > > Nov 19 20:03:32 node1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! > [ll_ost_80:4894] > Nov 19 20:03:32 node1 kernel: CPU 3: > Nov 19 20:03:34 node1 kernel: Lustre: Skipped 40339060 previous similar > messages 0; still busy with 3 active R

Re: [Lustre-discuss] 1.8.1.1

2009-11-10 Thread Papp Tamás
Derek Yarnell wrote, On 2009. 11. 10. 7:19: > This looks exactly like what we are running into, my bugzilla > (https://bugzilla.lustre.org/show_bug.cgi?id=21256) was duped to this > bug, > > https://bugzilla.lustre.org/show_bug.cgi?id=19557 > > But I am not sure until you reported it that you a

Re: [Lustre-discuss] 1.8.1.1

2009-11-09 Thread Derek Yarnell
This looks exactly like what we are running into; my bugzilla (https://bugzilla.lustre.org/show_bug.cgi?id=21256) was duped to this bug, https://bugzilla.lustre.org/show_bug.cgi?id=19557 But I was not sure, until you reported it, that you are also seeing that your network stack is being shut

Re: [Lustre-discuss] 1.8.1.1

2009-11-09 Thread Papp Tamás
Papp Tamás wrote, On 2009. 11. 09. 19:34: > > Is there any known bug about this? > Is this bug 21221 ? 10x tompos

[Lustre-discuss] 1.8.1.1

2009-11-09 Thread Papp Tamás
Hi all, First of all, I'm sorry, I cannot write a better bug report as I'm far away from the host, and right now there is no remote access. My colleague sent this in by email: Nov 9 19:05:55 node1 kernel: [] :ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170 Nov 9 19:05:55 node1 kernel: []

Re: [Lustre-discuss] 1.8.1.1 write slow performance :/

2009-11-09 Thread Brian J. Murrell
On Sun, 2009-11-08 at 21:52 +0100, Piotr Wadas wrote: > I just did some speed tests between client and filesystem server, > with dedicated GbitEthernet connection, I compared uploading via > lustre-mounted share, and uploading to the same share, mounted > as loopback lustre client on filesystem se

Re: [Lustre-discuss] 1.8.1.1 write slow performance :/

2009-11-08 Thread Piotr Wadas
One more thing - the server is Lustre 1.8.1.1 and the client is Lustre 1.8.1; I haven't upgraded the kernels on the clients yet. DT -- Linux aleft 2.6.27.29-0.1_lustre.1.8.1.1-default #1 SMP drbd 8.3.5-(api:88/proto:86-91) pacemaker 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe Lustre 1.8.1.1-20091009080716-PRIS

Re: [Lustre-discuss] 1.8.1.1 write slow performance :/

2009-11-08 Thread Piotr Wadas
To be exact, I got similar rates with a hundred files, one megabyte each, so it's probably not about the size of the files. Lustre client rate is 100+ download and 4.9 upload, and nfs => remote nfs => local lustre client => local lustre server, via the same ethernet interface, is 180+ download and ~100
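
For comparison, a minimal per-path throughput check along the lines of the test above (a sketch; the mount point and file name are placeholders, 1 MB blocks to match the file size used, and O_DIRECT to keep the client cache out of the measurement):

    dd if=/dev/zero of=/mnt/lustre/ddtest bs=1M count=100 oflag=direct    # write path
    dd if=/mnt/lustre/ddtest of=/dev/null bs=1M iflag=direct              # read path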

[Lustre-discuss] 1.8.1.1 write slow performance :/

2009-11-08 Thread Piotr Wadas
-- Linux aleft 2.6.27.29-0.1_lustre.1.8.1.1-default #1 SMP drbd 8.3.5-(api:88/proto:86-91) pacemaker 1.0.6-cebe2b6ff49b36b29a3bd7ada1c4701c7470febe Lustre 1.8.1.1-20091009080716-PRISTINE-2.6.27.29-0.1_lustre.1.8.1.1-default Well, I've set up everything using a 64-bit kernel, for now I got ~4 TB of u

Re: [Lustre-discuss] 1.8.1.1, 2.6.27.29-0.1_lustre.1.8.1.1-default and HIGHMEM 64G

2009-11-07 Thread Andreas Dilger
On 2009-11-07, at 09:17, Piotr Wadas wrote: > Hello. Is there any problem with 8GB RAM and Lustre? Default binary > lustre-enabled 2.6.27.29 kernel for SLES11 provided by SUN has a 4GB > limit. Are you sure you are using the x86_64 kernel and not the i386 kernel? We have lots of customers usin
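
A quick way to confirm which kernel flavour is running and how much memory it actually sees (plain commands, nothing Lustre-specific):

    uname -m    # should print x86_64 rather than i686/i386
    uname -r    # running kernel release
    free -g     # total memory visible to the kernel, in GB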

[Lustre-discuss] 1.8.1.1, 2.6.27.29-0.1_lustre.1.8.1.1-default and HIGHMEM 64G

2009-11-07 Thread Piotr Wadas
Hello. Is there any problem with 8GB RAM and Lustre? The default binary lustre-enabled 2.6.27.29 kernel for SLES11 provided by SUN has a 4GB limit. I tried to rebuild from the appropriate linux-patched-source, also provided by SUN, with only this kernel option changed, and I discovered that, despite