[Lustre-discuss] problem with secondary groups

2012-05-25 Thread Temple Jason
Hello, I am running Lustre 2.1.56 on the server side and 1.8.4 (Cray) on the client side. I am having the classic secondary group problem, but when I enable the upcall on the MDS via lctl conf_param lustre-MDT.mdt.identity_upcall=/usr/sbin/l_getidentity, I still have the same permissions problem.
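A minimal sketch of the MDS-side commands involved, assuming the target is actually named lustre-MDT0000 (the index is a guess; substitute your real target name):

  # enable the identity upcall persistently via the MGS
  lctl conf_param lustre-MDT0000.mdt.identity_upcall=/usr/sbin/l_getidentity
  # confirm the setting took effect on the MDS
  lctl get_param mdt.lustre-MDT0000.identity_upcall
  # flush cached identities so new secondary groups are picked up
  lctl set_param mdt.lustre-MDT0000.identity_flush=-1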

Re: [Lustre-discuss] Swap over lustre

2011-08-17 Thread Temple Jason
Hello, I experimented with swap on Lustre in as many ways as possible (without touching the code), using the shortest path possible, to no avail. The code is not able to handle it at all, and the system always hung. Without serious code rewrites, this isn't going to work for you. -Jason

Re: [Lustre-discuss] software raid

2011-03-24 Thread Temple Jason
I believe the bias against software RAID is historical. I use software RAID exclusively for my Lustre installations here and have never seen any problem with it. The argument used to be that having dedicated hardware run your RAID removed the overhead of the OS having to control it.
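As a rough illustration of the kind of setup I mean (device names, chunk size, and fsname are placeholders, not a recommendation):

  # build a RAID6 md device from six disks for use as an OST
  mdadm --create /dev/md0 --level=6 --raid-devices=6 --chunk=128 /dev/sd[b-g]
  # format it as a Lustre OST pointing at the MGS
  mkfs.lustre --ost --fsname=lustre --mgsnode=mgs@tcp0 /dev/md0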

Re: [Lustre-discuss] "up" a router that is marked "down"

2011-01-25 Thread Temple Jason
I've found that even with the Protocol Error, it still works. -Jason
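If it helps, the commands I use for checking and forcing router state are roughly these (syntax from memory and the gateway NID is a placeholder, so treat this as a sketch and check lctl help on your version):

  # verify the gateway is reachable at the LNET level
  lctl ping 10.0.0.1@tcp
  # list known routes and their up/down status
  lctl show_route
  # mark routing through a gateway up again
  lctl set_route 10.0.0.1@tcp up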

Re: [Lustre-discuss] status of lustre 2.0 on 2.6.18-194.17.1.0.1.el5 kernels

2011-01-11 Thread Temple Jason
I mean this article, which I forgot to attach: http://feedproxy.google.com/~r/InsideHPC/~3/LI9iHNGoFZw/

Re: [Lustre-discuss] status of lustre 2.0 on 2.6.18-194.17.1.0.1.el5 kernels

2011-01-11 Thread Temple Jason
And what impact do you foresee from this article?

Re: [Lustre-discuss] Multihome question : unable to mount lustre over tcp.

2010-12-10 Thread Temple Jason
Hi, you need to run tunefs.lustre on all the servers to add the new @tcp NIDs. Thanks, Jason
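A rough sketch of what I mean, with device paths and interface names as placeholders (do this with the targets unmounted, and be careful with --writeconf):

  # make sure LNET on each server is configured for both networks, e.g. in modprobe.conf:
  options lnet networks="o2ib0(ib0),tcp0(eth0)"
  # then regenerate the configuration logs so the new NIDs are registered
  tunefs.lustre --writeconf /dev/mdtdev     # on the MDS/MGS
  tunefs.lustre --writeconf /dev/ostdev     # on each OSS, once per OST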

Re: [Lustre-discuss] Metadata performance question

2010-10-05 Thread Temple Jason
I believe that was the *goal* of 2.0, but unfortunately that lofty goal was not met; its timeline seemed to stretch from when Sun purchased Lustre until some time in the far future. See here for the features available in 2.0: http://wiki.lustre.org/index.php/Lustre_2.0_Features -Jason

Re: [Lustre-discuss] How do you monitor your lustre?

2010-09-30 Thread Temple Jason
We use Ganglia with collectl. These versions are the only ones I could find that work together in this way:

  Sep 30 13:35 [r...@wn125:~]# rpm -qa | grep collectl
  collectl-3.4.2-5
  Sep 30 13:35 [r...@wn125:~]# rpm -qa | grep ganglia
  ganglia-gmond-3.1.7-1

We are quite happy with it. Thanks, Jason
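As a quick sanity check that collectl is actually picking up Lustre traffic, I believe something like this works (the -sl subsystem flag is from memory, so check the collectl documentation for your version):

  # print Lustre summary statistics every 5 seconds
  collectl -sl -i 5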

Re: [Lustre-discuss] kernel: BUG: soft lockup - CPU stuck for 10s! with lustre 1.8.4

2010-09-20 Thread Temple Jason
It appears that turning off statahead does indeed avoid the soft lockup bug, but this seems to me to be a workaround, not a solution. Isn't statahead useful for performance gains? I am not comfortable making my users' jobs waste more CPU time because I have to implement a workaround instead of a fix.
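For anyone following along, the workaround in question is disabling statahead on the clients, roughly like this (not persistent across remounts, as far as I know):

  # per client, via lctl
  lctl set_param llite.*.statahead_max=0
  # or directly through /proc
  for f in /proc/fs/lustre/llite/*/statahead_max; do echo 0 > $f; done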

[Lustre-discuss] kernel: BUG: soft lockup - CPU stuck for 10s! with lustre 1.8.4

2010-09-14 Thread Temple Jason
Hello, I have recently upgraded my Lustre filesystem from 1.8.3 to 1.8.4. The first day we brought our system online with the new version, we started seeing clients getting stuck in this soft lockup loop. The load shoots up over 120, and eventually the node becomes unusable and requires a hard reboot.