Re: [Lustre-discuss] New wc-discuss Lustre Mailing List

2011-07-02 Thread Robert Read
Hi,

We've received feedback that some people are unwilling to join the wc-discuss 
list because they believe it requires using a Gmail address. Luckily, this is 
not the case. It is possible to subscribe to wc-discuss (or any Google group, 
for that matter) using a non-Gmail address. The key step is to create a Google 
(not Gmail) account using your own email address. There are "Create an account 
now" links on Google login pages, or this link should take you to the sign-up 
screen, where you can use an already existing email address: 

https://www.google.com/accounts/SignUpWidget

Once created, your new account will work like any other web-service account 
associated with your email address, and will allow you to sign up for Google 
groups like wc-discuss.

cheers,
robert


On Jun 29, 2011, at 12:01 , Andreas Dilger wrote:

 Hi,
 I'd like to announce the creation of a new mailing list for discussing
 Lustre releases from Whamcloud.  We will also continue to monitor and
 participate on the existing lustre.org mailing lists, but we consider
 it prudent to host a separate list from lustre.org due to uncertainty
 regarding the long-term plans for lustre.org.
 
 Subscription information and archives are available at:
 
 https://groups.google.com/a/whamcloud.com/group/wc-discuss/
 
 
 Cheers, Andreas
 --
 Andreas Dilger 
 Principal Engineer
 Whamcloud, Inc.
 
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Does lustre 1.8 stop update and maintenance?

2011-03-01 Thread Robert Read
Hi,

I cannot comment on Oracle's plans regarding Lustre, but Whamcloud does 
intend to continue supporting 1.8.x for some time.  You can see activity 
related to 1.8.x (as well as 2.1) at http://jira.whamcloud.com. 

cheers,
robert read
Whamcloud, Inc



On Mar 1, 2011, at 4:48 , Larry wrote:

 Hi, all
 
 Has Lustre 1.8 stopped being updated and maintained? I have not seen any
 updates for a long time; only Whamcloud has released Lustre 2.1. Does
 this mean Oracle has frozen the development of Lustre?

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre community build server

2010-12-16 Thread Robert Read
Hi Aurélien,

Yes, we've noticed Hudson's support for testing is not quite what we need, so 
we're planning to use Hudson to trigger our testing system, but not necessarily 
to manage it.  We'd definitely be interested in learning more about your 
experiences, though. 

robert




On Dec 16, 2010, at 1:22 , DEGREMONT Aurelien wrote:

 Hi Robert,
 
 That's very interesting.
 At CEA we also have a Hudson platform, and I'm running acceptance-small for 
 several Lustre branches on it. Hudson is a great tool, but it was not designed 
 to test code that runs in kernel space and can crash your nodes or, at least, 
 leave your kernel in a bad shape. I would be very interested to share Hudson 
 experiences testing Lustre and to see how you've configured it for your own tests.
 
 
 Aurélien
 
 Robert Read a écrit :
 Hi,
 
 As I mentioned the other day, Whamcloud is hosting a Hudson build server and 
 producing snapshot builds for CentOS 5.x (and Ubuntu 10.04 when it works) for 
 both the 1.8.x and 2.x branches. Our intention is for this to be a resource for 
 the Lustre community to find recent Lustre packages for a variety of Linux 
 distributions. Early next year we'll connect this to our test system so that at 
 least some of the packages can be tested, as well.
 
 We would be interested in hearing from anyone who would like to participate in 
 producing builds. Hudson is a distributed system, and it's easy to add more 
 build nodes, even behind firewalls (some of us are running build VMs on our 
 home machines). If you would like to add another distribution or architecture 
 we don't have yet, or even one we do have (the more the merrier), we'd be 
 happy to work with you to do that.  Please contact me if you are interested. 
  
 cheers,
 robert
 
 
 
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre community build server

2010-12-16 Thread Robert Read
Hi, 

On Dec 16, 2010, at 9:42 , DEGREMONT Aurelien wrote:

 Hi
 
 Robert Read a écrit :
 Hi Aurélien,
 
 Yes, we've noticed Hudson's support for testing is not quite what we need, 
 so we're planning to use Hudson to trigger our testing system, but not 
 necessarily to manage it.  We'd definitely be interested in learning more 
  about your experiences, though.
 
 I do not know what you mean by triggering your testing system. But here is 
 what I set up.

I mean that once the build is complete we will notify the test system that a new 
build is ready to be picked up and tested.  We haven't yet implemented that 
part of it. 

 Hudson has only one slave node dedicated to testing Lustre 2.
 Hudson launches a shell script on it through ssh.
 
 This script:
 - retrieves the Lustre source (managed by the Hudson git plugin)
 - compiles it
 - launches acceptance-small with several parameters
 - acceptance-small then connects to the other nodes dedicated to these tests
 
 acc-sm has been patched:
 - to be more error resilient (it does not stop at the first failure)
 - to generate a test report in JUnit format
 
 Hudson fetches the JUnit report and parses it with its JUnit plugin.
 Hudson can then display all test successes and failures in its interface.
 
 Everything goes fine as long as:
 - the test suite leaves the node in good shape. It is difficult to have an 
 automatic way to put the node back; currently we need to fix that manually.
 - Hudson does not know about the other nodes used by acc-sm, and so can 
 trigger tests even if some satellite nodes are unavailable.
 
 How do you do this on your side?


We don't plan to use Hudson to manage our testing results as I don't think it 
would scale very well for all the testing we might do for each build. We're 
currently building a more custom results server that's similar (in spirit at 
least) to the kinds of tools we had at Oracle.  We'll make it available once 
it's in presentable form. 

Actually, our first step was to replace the acceptance-small.sh driver script 
with one that has a more sensible user interface for running the standard 
tests.  Since test-framework.sh on master has already been changed to 
produce test results in YAML format, the new script collects these along with the 
logs and is capable of submitting them to the test results server.  Currently 
this is being run manually, though.  Automating the test execution and 
connecting all the pieces will be the next step. 
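
To make that concrete, here is a rough sketch of the kind of driver I mean. It is an illustration only, not our actual script; the suite name, the LOGDIR convention, and the results URL are assumptions made for the example.

    #!/bin/bash
    # Hypothetical driver sketch: run one standard Lustre test suite, collect
    # the YAML results and logs it produces, and submit the bundle to a
    # results server for archiving and display.
    set -e

    SUITE=${1:-sanity}                                    # suite to run, e.g. sanity
    RESULTS_DIR=${RESULTS_DIR:-/tmp/results/$SUITE.$(date +%s)}
    RESULTS_URL=${RESULTS_URL:-http://results.example.com/upload}   # placeholder URL

    mkdir -p "$RESULTS_DIR"

    # Run the suite (from the lustre/tests directory); the assumption here is
    # that the test framework writes its per-test YAML results under LOGDIR.
    LOGDIR="$RESULTS_DIR" bash "$SUITE.sh" 2>&1 | tee "$RESULTS_DIR/$SUITE.console.log"

    # Bundle the YAML results together with the logs and submit them.
    tar czf "$RESULTS_DIR.tar.gz" -C "$(dirname "$RESULTS_DIR")" "$(basename "$RESULTS_DIR")"
    curl -f -F "bundle=@$RESULTS_DIR.tar.gz" "$RESULTS_URL"

In the automated setup, Hudson would simply invoke a wrapper like this after a successful build, instead of someone running it by hand.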

cheers,
robert

 
 
 Aurélien
 
 robert
 
 
 
 
 On Dec 16, 2010, at 1:22 , DEGREMONT Aurelien wrote:
 
  
 Hi Robert,
 
 That's very interesting.
 At CEA we also have a Hudson platform, and I'm running acceptance-small for 
 several Lustre branches on it. Hudson is a great tool, but it was not designed 
 to test code that runs in kernel space and can crash your nodes or, at least, 
 leave your kernel in a bad shape. I would be very interested to share Hudson 
 experiences testing Lustre and to see how you've configured it for your own 
 tests.
 
 
 Aurélien
 
 Robert Read a écrit :

 Hi,
 
 As I mentioned the other day, Whamcloud is hosting a Hudson build server 
 and producing snapshot builds for CentOS 5.x (and Ubuntu 10.04 when it 
 works) for both the 1.8.x and 2.x branches. Our intention is for this to be a 
 resource for the Lustre community to find recent Lustre packages for a 
 variety of Linux distributions. Early next year we'll connect this to our 
 test system so that at least some of the packages can be tested, as well.
 
 We would be interested in hearing from anyone who would like to 
 participate in producing builds. Hudson is a distributed system, and it's 
 easy to add more build nodes, even behind firewalls (some of us are 
 running build VMs on our home machines). If you would like to add another 
 distribution or architecture we don't have yet, or even one we do have 
 (the more the merrier), we'd be happy to work with you to do that.  Please 
 contact me if you are interested.
 
 cheers,
 robert
 
 
 
 
 
  
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre community build server

2010-12-15 Thread Robert Read
Hi,

As I mentioned the other day, Whamcloud is hosting a Hudson build server and 
producing snapshot builds for CentOS 5.x (and Ubuntu 10.04 when it works) for 
both the 1.8.x and 2.x branches. Our intention is for this to be a resource for the 
Lustre community to find recent Lustre packages for a variety of Linux 
distributions. Early next year we'll connect this to our test system so that at 
least some of the packages can be tested, as well.

We would be interested in hearing from anyone who would like to participate in 
producing builds. Hudson is a distributed system, and it's easy to add more 
build nodes, even behind firewalls (some of us are running build VMs on our 
home machines). If you would like to add another distribution or architecture we 
don't have yet, or even one we do have (the more the merrier), we'd be happy to 
work with you to do that.  Please contact me if you are interested.  

cheers,
robert




___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts

2010-09-17 Thread Robert Read
Hi,

On Sep 17, 2010, at 14:49 , Bernd Schubert wrote:

 Hello Cory,
 
 On 09/17/2010 11:31 PM, Cory Spitz wrote:
 Hi, Bernd.
 
 On 09/17/2010 02:48 PM, Bernd Schubert wrote:
 On Friday, September 17, 2010, Andreas Dilger wrote:
 On 2010-09-17, at 12:42, Jonathan B. Horen wrote:
 We're trying to architect a Lustre setup for our group, and want to
 leverage our available resources. In doing so, we've come to consider
 multi-purposing several hosts, so that they'll function simultaneously
 as MDS and OSS.
 
 You can't do this and expect recovery to work in a robust manner.  The
 reason is that the MDS is a client of the OSS, and if they are both on the
 same node that crashes, the OSS will wait for the MDS client to
 reconnect and will time out recovery of the real clients.
 
 Well, that is some kind of design problem. Even on separate nodes it can 
 easily happen that both the MDS and OSS fail, for example due to a power outage 
 of the storage rack. In my experience situations like that happen frequently...
 
 
 I think that just argues that the MDS should be on a separate UPS.
 
 well, there is not only a single reason. Another hardware issue is that
 an IB switch may fail. And we have also seen cascading Lustre
 failures: it starts with an LBUG on the OSS, which triggers another
 problem on the MDS...
 Also, for us this will actually become a real problem, one which cannot be
 easily solved. So this issue will become a DDN priority.

There is always a possibility that multiple failures will occur, and this 
possibility can be reduced depending on one's resources. The point here is simply 
that a configuration with an MDS and OSS on the same node guarantees multiple 
failures and aborted OSS recovery when that node fails.

cheers,
robert

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts

2010-09-17 Thread Robert Read
Hi,

On Sep 17, 2010, at 12:48 , Bernd Schubert wrote:

 On Friday, September 17, 2010, Andreas Dilger wrote:
 On 2010-09-17, at 12:42, Jonathan B. Horen wrote:
 We're trying to architect a Lustre setup for our group, and want to
 leverage our available resources. In doing so, we've come to consider
 multi-purposing several hosts, so that they'll function simultaneously
 as MDS and OSS.
 
 You can't do this and expect recovery to work in a robust manner.  The
 reason is that the MDS is a client of the OSS, and if they are both on the
 same node that crashes, the OSS will wait for the MDS client to
 reconnect and will time out recovery of the real clients.
 
 Well, that is some kind of design problem. Even on separate nodes it can 
 easily happen that both the MDS and OSS fail, for example due to a power outage 
 of the storage rack. In my experience situations like that happen frequently...
 
 I think some kind of pre-connection would be required, where a client can tell 
 a server that it was rebooted and that the server should not wait any 
 longer for it. Actually, it shouldn't be that difficult, as different 
 connection flags already exist. So if the client contacts a server and asks for an 
 initial connection, the server could check for that NID and then immediately 
 abort recovery for that client.

This is an interesting idea, but the NID alone is not ideal, as this wouldn't be 
compatible with multiple mounts on the same node. Multiple mounts are perhaps not 
very useful in production, but they are very useful for testing.

Another option would be to hash the mount point pathname (and some other data, 
such as the NID) and use this as the client uuid.  Then the client uuid would 
be persistent across reboots, and the server would rely on flags to detect whether 
this was a reconnect or a new connection after a reboot or remount.
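
As a purely illustrative sketch of the idea (this is not actual Lustre code, and the mount point and NID below are made-up example values):

    # Hypothetical sketch: derive a client uuid that stays the same across
    # reboots and remounts by hashing the mount point plus the client NID.
    MOUNT_POINT=/mnt/lustre              # example value
    CLIENT_NID=192.168.10.1@o2ib         # example value
    CLIENT_UUID=$(echo -n "${MOUNT_POINT}:${CLIENT_NID}" | md5sum | cut -d' ' -f1)
    echo "client uuid: ${CLIENT_UUID}"
    # Because the same node and mount point always hash to the same uuid, the
    # server could use connection flags to tell a live reconnect apart from a
    # fresh connection after a reboot or remount, and abort recovery accordingly.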

robert

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] kernel: BUG: soft lockup - CPU stuck for 10s! with lustre 1.8.4

2010-09-17 Thread Robert Read
Hi Peter,

Perhaps the link got mangled by your mail client? (It does have some 
seemingly unusual characters for a URL.)  My interpretation of Gabriele's 
reply is that the problem occurred even with statahead disabled, so in
that case this patch might be worth trying. 
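
(For anyone following along: disabling statahead, as Gabriele did, is just a client-side tunable. The commands below are the usual way to do it; double-check the parameter name on your version.)

    # Disable statahead on a Lustre client:
    lctl set_param llite.*.statahead_max=0
    # or via procfs, one mount at a time:
    for f in /proc/fs/lustre/llite/*/statahead_max; do echo 0 > "$f"; done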

robert




On Sep 17, 2010, at 10:18 , Peter Jones wrote:

 The URL does not work for me, but if it is a statahead issue then surely 
 turning statahead off would be a simple workaround to avoid having to 
 apply a patch.
 
 Fan Yong wrote:
  On 9/14/10 8:55 PM, Gabriele Paciucci wrote:
 
 I have the same problem, I put the statahead_max to 0 !!!
 
  In fact, I have made a patch for statahead-related issues (including 
  this one) against lustre-1.8, which is under inspection:
  http://review.whamcloud.com/#change,2
  If possible, you can try that patch.
 
 Cheers,
 --
 Nasf
 
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Local Northern California Lustre Users Group

2010-08-09 Thread Robert Read
Yes, I remember discussing this back at the LUG too, and it sounds good to me. 
I agree that an evening meeting would probably be better in general.  

robert

On Aug 4, 2010, at 10:06 AM, D. Marc Stearman wrote:

 I think I was one of those suggesting the same thing.  I think a lunch  
 time meeting is fine for folks on the Peninsula, or in the South Bay,  
 but for those of us in the far East Bay it could take an hour to get  
 there.  I was thinking a monthly meeting sometime in the evening for  
 dinner/drinks would work well.
 
 -Marc
 
 
 D. Marc Stearman
 Lustre Operations Lead
 m...@llnl.gov
 925.423.9670
 Pager: 1.888.203.0641
 
 
 
 
 On Jun 11, 2010, at 3:54 PM, Sebastian Gutierrez wrote:
 
 Hello,
 
 While at LUG 2010, a few of us mentioned that we would be interested  
 in a local LUG (LLUG?) in the bay area.  We discussed that it may be  
 a good idea to meet up about once a month to discuss issues and  
 solutions.  If you are still interested I am able to provide the  
 initial meeting spot for our meeting.
 
 I am offering Stanford as our first meeting spot.  If we can settle  
 on a date we can start meeting either at the end of the month or  
 early next month.  I was thinking we could meet on the second Monday  
 or Wednesday of each month, during lunch.
 
 What I need: a date that a majority could agree upon, or I can  
 choose a date. Once we decide on a date I will set up a RSVP due date.
 
 My rough outline of the time would be:
 
 1. Current updates, if any
 2. Someone present an issue
 3. We discuss solutions
 4. goto 2
 
 Cheers,
 Sebastian
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre-1.9.210: mkfs.ldiskfs?

2009-07-01 Thread Robert Read

On Jul 1, 2009, at 10:03 , Josephine Palencia wrote:


 OS: Centos 5.3, x86_64
 Kernel: 2.6.18-128.1.6

 [r...@attractor ~]# cat /proc/fs/lustre/version
 lustre: 1.9.210
 ..

 mkfs.lustre is looking for mkfs.ldiskfs?
 I did a find on a previous working lustre-1.9 install and didn't find one.
 Please advise.



What configure options did you use when you built Lustre?  I think 
--with-ldiskfsprogs will configure Lustre to use these names; however, 
the standard version of the e2fsprogs utilities doesn't use this naming scheme. 
This would be used, for instance, if you needed to have both the 
original and the ldiskfs versions of e2fsprogs installed at the same time.
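
For example, a build along these lines would use the ldiskfs-named tools (the kernel source path below is only an illustration, not a required value):

    # Configure Lustre to invoke the ldiskfs-named e2fsprogs tools
    # (e.g. mkfs.ldiskfs) so they can coexist with the stock e2fsprogs.
    ./configure --with-ldiskfsprogs --with-linux=/usr/src/kernels/2.6.18-128.1.6   # path illustrative
    make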

robert


 [r...@attractor ~]# mkfs.lustre --fsname=jwan0 --mdt --mgs /dev/hdc1

Permanent disk data:
 Target: jwan0-MDT
 Index:  unassigned
 Lustre FS:  jwan0
 Mount type: ldiskfs
 Flags:  0x75
   (MDT MGS needs_index first_time update )
 Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
 Parameters:

 checking for existing Lustre data: not found
 device size = 19085MB
 2 6 18
 formatting backing filesystem ldiskfs on /dev/hdc1
 target name  jwan0-MDT
 4k blocks 0
 options        -J size=400 -i 4096 -I 512 -q -O dir_index,uninit_groups -F
 mkfs_cmd = mkfs.ldiskfs -j -b 4096 -L jwan0-MDT  -J size=400 -i  
 4096
 -I 512 -q -O dir_index,uninit_groups -F /dev/hdc1
sh: mkfs.ldiskfs: command not found


 mkfs.lustre FATAL: Unable to build fs /dev/hdc1 (32512)

 mkfs.lustre FATAL: mkfs failed 32512


 Thanks,
 josephin

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre and Xen

2008-10-06 Thread Robert Read

On Oct 4, 2008, at 08:33 , Daniel Ferber wrote:



I have two questions:

1. On the OSS/OST side, can one run Lustre as a server on a system  
running a Xen kernel?


2. Conversely: is the lustre client compatible with Xen kernel  
machines?



The answer to both questions is yes - I routinely test Lustre  servers  
and clients on Xen using a RHEL 5 kernel with our patches and  
configured as a Xen guest. I have not tried a patchless client on a  
RHEL 5 Xen kernel yet, but I would expect that to work if you built  
our modules for it.


robert






Thanks much,
Dan


--
Dan Ferber, [EMAIL PROTECTED]
+1 612-486-5167 (Office)
+1 651-356-9481 (Cell
x29704 (Sun Internal)




___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre and Xen

2008-10-06 Thread Robert Read


On Oct 6, 2008, at 16:03 , Andreas Dilger wrote:

 On Oct 04, 2008  10:33 -0500, Daniel Ferber wrote:
 I have two questions:

 1. On the OSS/OST side, can one run Lustre as a server on a system  
 running a
 Xen kernel?

 2. Conversely: is the lustre client compatible with Xen kernel  
 machines?

  Note that "work" and "perform well" are two completely different  
  issues.
 While CPU virtualization is quite efficient these days, IO  
 virtualization
 is not necessarily very fast at all.




I haven't benchmarked IO on Xen in a long time, but I'd expect a  
pretty reasonable percentage of native IO at this point. Not quite as  
efficient as CPU virtualization, of course, but I wouldn't necessarily  
say "not very fast" either.

robert

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss