[Lustre-discuss] Fwd: Annual Lustre Community Survey

2013-04-10 Thread Nathan Rutman
Forwarding for Jeff:

 
 
 
 From: Jeff Denworth jdenwo...@ddn.com
 Date: Thursday, April 4, 2013 5:54 PM
 To: lustre-commun...@lists.lustre.org lustre-commun...@lists.lustre.org, 
 lustre-discuss@lists.lustre.org lustre-discuss@lists.lustre.org
 Subject: Annual Lustre Community Survey
 
 All,

 The past year has seen many significant milestones in Lustre development, 
 community participation and user adoption.  Some very exciting work has been 
 done to enable performance at radically new levels of scale; usability and 
 administrative features have also been added to enable increased technology 
 adoption.
 
 This is a perfect time for us all, as a community, to take stock of where we 
 are and what comes next.  Please take 2 minutes to respond to this anonymous 
 8-question, multiple-choice survey.  Your answers will help inform the 
 community discussion around the Lustre ecosystem, its evolution and its 
 future.  A summary of your responses will be presented live in 2 weeks at 
 the Lustre User Group (LUG’13).  YOUR contributions can help ensure that the 
 survey results reflect the needs, wants and desires of the worldwide 
 community. 
  
 Thank you in advance for a minute of your time,
  
 The DDN Team 
 Survey URL:  https://www.surveymonkey.com/s/LustreFS

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] MD benchmarking with mdtest

2013-03-06 Thread Nathan Rutman
A request on a call today prodded me to publish our general-purpose Lustre 
mdtest benchmark plan, the metadata counterpart to the IOR plan I published 
earlier.  I hope you find it useful.

Lustre MD Benchmark Methodology using mdtest
http://goo.gl/UBs1p

Lustre IO Benchmark Methodology using IOR
http://goo.gl/7AWwQ

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [wc-discuss] New Test Framework Development - Requirements Capture

2012-12-18 Thread Nathan Rutman
Renaming Chris' page to "Test Framework Requirements", I have added an upper-level page
http://wiki.opensfs.org/Improving_the_test_framework
that references the three areas of focus we identified on our last phone call:
	• Development of a completely new test framework that will provide the capabilities to test Lustre at exascale.
	• Incremental benefits by improving the current test framework.
	• Providing better coverage by improving individual tests.
and links to the requirements page.

On Dec 1, 2012, at 9:55 AM, "Gearing, Chris" chris.gear...@intel.com wrote:

 Hi,

 During the last meeting we decided to set up a Wiki page to allow everyone to capture
 their thoughts, requirements and ideas for possible inclusion into the new framework
 environment.

 This page is now available on the OpenSFS Wiki and we would welcome your input before
 the 7th December; please provide input, however big or small.

 http://wiki.opensfs.org/New_test_framework

 Many thanks
 Chris
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [wc-discuss] Seeking contributors for Lustre User Manual

2012-11-13 Thread Nathan Rutman
Would it be easier to move the manual back to a Wiki?  The low hassle factor of 
wikis has always been a draw for contribution.  The openSFS site is up and 
running with MediaWiki now (wiki.opensfs.org).


On Nov 9, 2012, at 4:09 PM, Dilger, Andreas andreas.dil...@intel.com wrote:

 In hopes of improving the quality, coverage, and currency of the Lustre
 User Manual, I'm putting out a call to the Lustre community for
 contributors to this important resource.
 
 The user manual is especially important for users new to Lustre, but it has
 fallen into some disrepair now that there is no longer a dedicated
 documentation writer for it.  In addition, many sections of the
 manual have become outdated over the years (such as example output,
 command descriptions, etc.) and need to be refreshed.
 
 The Lustre User Manual is also a component of Lustre that has a much
 larger pool of potential contributors than the code itself.  If you want
 to contribute to Lustre, but are not able to contribute with patches to
 the code, then this is a great opportunity to help out.  If you benefit
 from the open source development of Lustre, then contributing to the
 manual is a chance to return something back to the community.  The Lustre
 Manual is released under a Creative Commons license, so it is open to all
 of us to improve.
 
 
 While there is not currently a todo list for the areas of the manual
 that need updating, looking through open LUDOC tickets is one option:
 
 
 http://bugs.whamcloud.com/secure/QuickSearch.jspa?searchString=LUDOC%20open
 
 There are a number of existing documentation tickets for features that are
 under development for the Lustre 2.4 release (which we expect to be
 completed internally), but there are also some tickets from users pointing
 out errors in the document that need to be fixed.
 
 Another way to improve the manual is to simply fetch the manual and read
 some section at random that you are either interested in, or have some
 knowledge about, and see if any of the text is confusing, outdated, or
 incorrect and needs to be updated.
 
 
 The PDF and HTML versions of the current manual are available at:
 
http://wiki.whamcloud.com/display/PUB/Documentation
 
 The manual source is hosted in a Git/Gerrit repository in Docbook XML
 format and can be downloaded at:
 
git clone http://git.whamcloud.com/doc/manual lustre-manual
 
 An account in Gerrit is required to download the manual and submit patches.
 
 There are some wiki pages that describe the process and packages needed to
 modify, build, and submit patches to the manual:
 
 
 http://wiki.whamcloud.com/display/PUB/Making+changes+to+the+Lustre+Manual
 
 Cheers, Andreas
 --
 Andreas Dilger
 Lustre Software Architect
 Intel Corporation
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [wc-discuss] Remove an inactive OST

2012-11-05 Thread Nathan Rutman
I suspect if you restart the clients the problem will go away.
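
For example (just a sketch; the MGS NID and fsname here are taken from your
output below, so adjust to your setup):

    umount /mnt/data
    mount -t lustre 192.168.11.9@tcp:/cetafs /mnt/data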

On Oct 30, 2012, at 1:49 AM, Alfonso Pardo alfonso.pa...@ciemat.es wrote:

 More about my problem:
 
 If I run the command lfs df -i on the client, I can see the inactive/removed 
 OST:
 
 UUID                    Inodes     IUsed      IFree  IUse%  Mounted on
 cetafs-MDT0000_UUID   975470592  23375132  952095460     2%  /mnt/data[MDT:0]
 cetafs-OST0000_UUID    19073280  18889414     183866    99%  /mnt/data[OST:0]
 cetafs-OST0001_UUID    19073280  18889304     183976    99%  /mnt/data[OST:1]
 cetafs-OST0002_UUID    19073280  18889353     183927    99%  /mnt/data[OST:2]
 cetafs-OST0003_UUID    19073280  18889397     183883    99%  /mnt/data[OST:3]
 cetafs-OST0004_UUID    19073280  18889372     183908    99%  /mnt/data[OST:4]
 cetafs-OST0005_UUID    19073280  18889440     183840    99%  /mnt/data[OST:5]
 cetafs-OST0006_UUID    19073280  18889184     184096    99%  /mnt/data[OST:6]
 cetafs-OST0007_UUID    19073280  18889292     183988    99%  /mnt/data[OST:7]
 cetafs-OST0008_UUID    19073280  18889134     184146    99%  /mnt/data[OST:8]
 cetafs-OST0009_UUID    19073280  18889413     183867    99%  /mnt/data[OST:9]
 cetafs-OST000a_UUID    19073280  18888999     184281    99%  /mnt/data[OST:10]
 cetafs-OST000b_UUID    19073280  18889393     183887    99%  /mnt/data[OST:11]
 cetafs-OST000c_UUID    19073280  18889290     183990    99%  /mnt/data[OST:12]
 cetafs-OST000d_UUID    19073280  18889353     183927    99%  /mnt/data[OST:13]
 cetafs-OST000e_UUID    19073280  18889349     183931    99%  /mnt/data[OST:14]
 cetafs-OST000f_UUID    19073280  18889357     183923    99%  /mnt/data[OST:15]
 cetafs-OST0010_UUID    19073280  18889378     183902    99%  /mnt/data[OST:16]
 cetafs-OST0011_UUID    19073280  18889385     183895    99%  /mnt/data[OST:17]
 cetafs-OST0012_UUID    19073280   2629014   16444266    13%  /mnt/data[OST:18]
 cetafs-OST0013_UUID    19073280   2629045   16444235    13%  /mnt/data[OST:19]
 OST0014 : Resource temporarily unavailable
 cetafs-OST0015_UUID     7621120   1494736    6126384    19%  /mnt/data[OST:21]
 cetafs-OST0016_UUID     7621120   1495107    6126013    19%  /mnt/data[OST:22]
 cetafs-OST0017_UUID     7621120   1494952    6126168    19%  /mnt/data[OST:23]
 cetafs-OST0018_UUID     7621120   1494865    6126255    19%  /mnt/data[OST:24]
 
 filesystem summary:   975470592  23375132  952095460     2%  /mnt/data
 
 But if I list the Lustre devices on the client with lctl dl, the 
 inactive/removed OST doesn't exist:
 
   0 UP mgc MGC192.168.11.9@tcp 7ac5cb56-40f1-7183-672b-8d77a7f42d5d 5
   1 UP lov cetafs-clilov-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 4
   2 UP mdc cetafs-MDT0000-mdc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   3 UP osc cetafs-OST0000-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   4 UP osc cetafs-OST0001-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   5 UP osc cetafs-OST0002-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   6 UP osc cetafs-OST0003-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   7 UP osc cetafs-OST0004-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   8 UP osc cetafs-OST0005-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
   9 UP osc cetafs-OST0006-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  10 UP osc cetafs-OST0007-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  11 UP osc cetafs-OST0012-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  12 UP osc cetafs-OST0013-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  13 UP osc cetafs-OST0008-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  14 UP osc cetafs-OST000a-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  15 UP osc cetafs-OST0009-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  16 UP osc cetafs-OST000b-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  17 UP osc cetafs-OST000c-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  18 UP osc cetafs-OST000d-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  19 UP osc cetafs-OST000e-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  20 UP osc cetafs-OST000f-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  21 UP osc cetafs-OST0010-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  22 UP osc cetafs-OST0011-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  23 UP osc cetafs-OST0018-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  24 UP osc cetafs-OST0015-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  25 UP osc cetafs-OST0016-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
  26 UP osc cetafs-OST0017-osc-81012394e400 
 132759a4-add7-eaed-3b81-d27b42f97aef 5
 
 
 It is 

Re: [Lustre-discuss] [wc-discuss] joining two differents mgs in one

2012-07-03 Thread Nathan Rutman
Two ways I can think of:
1. Mount the MGS disks as ldiskfs and copy the config files from the old MGS to 
the new one, then point all the servers at the new MGS ipaddr with tunefs.
2. Writeconf all the servers, setting mgsnode to the new one.  This will 
regenerate all the config files on the new MGS.
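
A rough sketch of option 2 (placeholders in angle brackets; run the tunefs step
on every MDT and OST of both filesystems while they are unmounted):

    tunefs.lustre --writeconf --mgsnode=<new_mgs_nid> /dev/<target_device>

Then mount the new MGS first, followed by the MDTs and OSTs, so the config logs
are regenerated on the new MGS as each target re-registers.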

On Jun 28, 2012, at 9:12 PM, Philippe Weill wrote:

 Hello
 
 For the moment, and for historical reasons, we have four Lustre filesystems on 
 two different MGS-MDS servers,
 each MGS-MDS having two filesystems (1.8.7wc1).
 
 We want to change our MGS-MDS infrastructure (new hardware)
 and keep only one MGS and MDS server for all the filesystems.
 
 I know how to move an MDT to new hardware and change its IP,
 but how can we join the two distinct MGS devices into one?
 
 Thanks in advance
 -- 
 Weill Philippe -  Administrateur Systeme et Reseaux
 CNRS/UPMC/IPSL   LATMOS (UMR 8190)
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [wc-discuss] Lustre 2.2 production experience

2012-06-15 Thread Nathan Rutman
I wasn't complaining, just asking ;)




On Jun 14, 2012, at 6:27 PM, Andreas Dilger adil...@whamcloud.com wrote:

 I think the stability of 2.2.0 is comparable to 2.1.0.
 
 One issue is about the number of separate maintenance releases that can be 
 tested. If there are many maintenance releases, then each of those branches 
 would get correspondingly less testing time before release.
 
 Secondly, there is a limit on the amount of time that can be spent on porting 
 patches to each maintenance release.
 
 This system of maintenance vs. feature releases is similar to what is done 
 for Ubuntu Long Term Support (LTS) vs. regular releases, and Fedora vs. RHEL. 
 While there is a desire to make each release as reliable as possible, the 
 resources needed to maintain all of the releases for a long time would be 
 very high.  
 
 Cheers, Andreas
 
 On 2012-06-14, at 17:48, Nathan Rutman nathan_rut...@xyratex.com wrote:
 
 Is there a belief that Lustre 2.2 is any less stable than Lustre 2.1.0?  
 IOW, are the new features introduced in 2.2 believed to introduce more risk?
 
 On Jun 9, 2012, at 3:20 PM, Andreas Dilger wrote:
 
 I guess the new Lustre release process is similar to how Ubuntu is 
 released. While we do our best to make each release as stable as possible, 
 there is a different expectation for long-term updates of the feature 
 releases and the maintenance releases. 
 
 Cheers, Andreas
 
 On 2012-06-09, at 16:05, Wojciech Turek wj...@cam.ac.uk wrote:
 
 Thanks for a quick reply Andreas. I slightly misunderstood the lustre
 release process and thought that the next stable/production version is
 2.2
 
 I am then interested in the experience of people running Lustre 2.1
 
 Cheers
 
 Wojciech
 
 On 9 June 2012 21:52, Andreas Dilger adil...@whamcloud.com wrote:
 I think you'll find that there are not yet (m)any production deployments 
 of 2.2. There are a number of production 2.1 deployments, and this is the 
 current maintenance stream from Whamcloud.
 
 Cheers, Andreas
 
 On 2012-06-09, at 14:33, Wojciech Turek wj...@cam.ac.uk wrote:
 
 I am building a 1.5PB storage system which will employ Lustre as the
 main file system. The storage system will be extended at the later
 stage beyond 2PB.  I am considering using Lustre 2.2 for production
 environment. This Lustre storage system will replace our older 300TB
 system which is currently running Lustre 1.8.8. I am quite happy with
 lustre 1.8.8 however for the new system Lustre 2.2 seem to be a better
 match.  The storage system will be attached to a university wide
 cluster (800 nodes), hence there will be quite a large range of
 applications using the filesystem. Could people with production
 deployments of Lustre 2.2 share their experience please?
 
 
 --
 Wojciech Turek
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Metadata storage in test script files

2012-05-07 Thread Nathan Rutman

On May 4, 2012, at 7:46 AM, Chris Gearing wrote:

 Hi Roman,
 
 I think we may have rat-holed here, and perhaps it's worth just 
 re-stating what I'm trying to achieve.
 
 We have a need to be able to test in a more directed and targeted 
 manner, to be able to focus on a unit of code like lnet or an attribute 
 of capability like performance. However, since starting work on the 
 Lustre test infrastructure it has become clear to me that knowledge 
 about the capability, functionality and purpose of individual tests is 
 very general and held in the heads of Lustre engineers. Because we are 
 talking about targeting tests, we require knowledge about the capability, 
 functionality and purpose of the tests, not the outcome of running the 
 tests; or, to put it another way, what the tests can do, not what they 
 have done.
 
 One key fact about cataloguing the capabilities of the tests is that 
 for almost every imaginable case the capability of a test only changes 
 if the test itself changes, and so the rate of change of the data in the 
 catalogue is at most the rate of change of the test code itself, and in 
 practice much lower. The only exception to this could be that a test 
 suddenly discovers a new bug which has to have a new ticket attached to 
 it, although this should be very rare if we manage our 
 development process properly.
 
 This requirement leads to the conclusion that we need to catalogue all 
 of the tests within the current test-framework, and a catalogue equates 
 to a database; hence we need a database of the capability, functionality 
 and purpose of the individual tests. With this requirement in mind it 
 would be easy to create a database using something like MySQL that could 
 be used by applications like the Lustre test system, but using an 
 approach like that would make the database very difficult to share, and 
 make it even harder to attach the knowledge to the Lustre tree, which is 
 where it belongs.
 
 So the question I want to solve is how to catalogue the capabilities of 
 the individual tests in a database, store that data as part of the 
 Lustre source and, as a bonus, make the data readable and even carefully 
 editable by people as well as machines. Now, to focus on the last point, I 
 do not think we should constrain ourselves to something that can be read 
 by machine using just bash; we do have access to structured languages and 
 should make use of that fact.
 
I think we all agree 100% on the above...

 The solution to all of this seemed to be to store the catalogue about 
 the tests as part of the tests themselves
... but not necessarily that conclusion.

 , this provides for human and 
 machine accessibility, implicit version control and certainty that whatever 
 happens to the Lustre source the data goes with it. It is also the case 
 that by keeping the catalogue with the subject, the maintenance of the 
 catalogue is more likely to occur than if the two are separate.

I agree with all those.  But there are some difficulties with this as well:
1. bash isn't a great language to encapsulate this metadata
2. this further locks us in to the current test implementation - there's not much 
possibility of starting to write tests in another language if we're parsing through 
looking for bash-formatted metadata. Sure, multiple parsers could be written...
3. difficulty changing the md of groups of tests en masse - e.g. adding a slow keyword 
to a set of tests
4. no inheritance of characteristics - each test must explicitly list every 
piece of md.  This not only blows up the amount of md, it also becomes a source of 
typos, etc. causing problems.
5. no automatic modification of characteristics.  In particular, one piece of 
md I would like to see is maximum allowed test time for each test.  Ideally, 
this could be measured and adjusted automatically based on historical and 
ongoing run data.  But it would be dangerous to allow automatic modification to 
the script itself.

To address those problems, I think a database-type approach is exactly right, 
or perhaps a YAML file with hierarchical inheritance.
To some degree, this is an evolution vs. revolution question, and I prefer to 
come down on the revolution-enabling design, despite the problems you list.  
Basically, I believe the separated MD model allows for the replacement of 
test-framework, and this, to my mind, is the majority driver for adding the MD 
at all.
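
To illustrate the hierarchical-inheritance idea, a catalogue entry might look
something like the sketch below (purely illustrative; the keys and values are
made up, not a proposed schema):

    defaults:
        timeout: 600          # maximum allowed run time, in seconds
        keywords: []
    sanity:
        inherits: defaults
        keywords: [quick]
        tests:
            test_24:
                summary: rename across directories
                timeout: 1200     # overrides the inherited default
                keywords: [slow]

Each test would only list what differs from its group, and a tool (or a person)
could adjust the timeout values without touching the test scripts themselves.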


 
 My original use of the term test metadata is intended as a more modern 
 term for catalogue or the [test] library.
 
 So to refresh everybody's mind, I'd like to suggest that we place test 
 metadata in the source code itself using the following format, where the 
 here doc is inserted into the copy about the test function itself.
 
 ===
 TEST_METADATA
 Name:
   before_upgrade_create_data
 Summary:
   Copies lustre source into a node specific directory and then creates 
 a tarball using that directory
 Description:

Re: [Lustre-discuss] [wc-discuss] Re: Lustre and cross-platform portability

2012-03-21 Thread Nathan Rutman
We had a thought that a FUSE-based client based on liblustre might make sense 
to avoid the NFS/Samba re-export problems (scalability, coherency), at the 
potential price of some performance.  You mentioned on the phone this morning 
that someone had already done something like this in the past?  If we revive 
this work, presumably we can drop the Mac native client.  
On the other hand, I agree that the baggage is large, and it will take much 
more work to see this through, and without a real champion and/or a real 
community need, I understand the desire to drop it.

On Mar 15, 2012, at 11:38 AM, Andreas Dilger wrote:

 On 2012-03-15, at 8:22 AM, Tyler Hawes wrote:
 Having a native Windows & Mac client is by far our company's #1 most 
 important feature for the future. We have been seriously considering how we 
 could help make this happen, including putting funds and/or developers 
 toward the cause.
 
 Tyler, thanks for your feedback.
 
 While at Oracle, there was serious effort put toward having a Windows Native 
 Client.  Unfortunately, this died along with other Oracle Lustre development 
 projects, and the proprietary code has been sitting unused inside Oracle 
 since then.
 
 Similarly, the MacOS native client project started a couple of years ago, but 
 as yet none of the code has been released.  While some work was done for a 
 Solaris server, I don't see much hope in that making progress at all.
 
 In both cases, I would be supportive of these efforts if there was a 
 likelihood of them bearing fruit any time in the future, but so far I haven't 
 seen any progress in that direction.  If Oracle could somehow release the WNC 
 code and/or NRL can overcome the government roadblocks in their path for 
 releasing the MacOS client (MLC?), it would definitely change the direction 
 of my thinking.
 
 As it stands now, we have portability code for Windows/Mac without any 
 ability to even build the code, let alone test it, so it is just a burden for 
 Linux.  If there was real value being derived from that code (e.g. actual 
 users), then it might be an acceptable burden.
 
 I know we're in the minority on this, but I believe it's because we are not 
 using Lustre for HPC. We use it for post-production of TV/film. There are a 
 few others companies in our industry who have started doing this as well. My 
 point is that, while Lustre currently is focused on the HPC crowd who seem 
 not to care about Windows/Mac, Lustre's maturity is giving it the potential 
 to grow into other uses besides HPC. I wouldn't call them general use, but 
 other high-performance uses. In our industry, where there are a lot of 
 Windows & Mac workstations that we want to connect to the Lustre storage, 
 the Linux-only client is a major obstacle to that. I'm sure there are other 
 industries that would benefit from this. 
 
 Yes, the TV/film industry was one of the prime motivators for WNC.  There 
 have always been a small number of users from this market, and growth in this 
 industry (and others) is definitely welcomed.
 
 If the community wants to keep Lustre strictly HPC focused and discourage 
 other industries from joining in, then abandoning the bridge (albeit the 
 half-built bridge that it is) to Linux/Windows is a good way to do that. If, 
 on the other hand, there is a desire to get some other industries involved, 
 perhaps with more resources and contribution coming from them, then I think 
 it's important to build on the work that has been done. In that regard, 
 upstream Linux kernel inclusion seems like a very low priority to me.
 
 I think even for upstream kernel submission, if it were clear that the 
 layering in Lustre was for a valid purpose (i.e. existing Mac/Win clients) 
 then it might be accepted.  As it stands now, we can't honestly make that 
 argument to the kernel maintainers.
 
 2012/3/15 lustre-discuss-requ...@lists.lustre.org
 -- Forwarded message --
 From: Andreas Dilger adil...@whamcloud.com
 To: t...@lists.opensfs.org
 Cc: wc-discuss wc-disc...@whamcloud.com, lustre-discuss discuss 
 lustre-discuss@lists.lustre.org, Lustre Devel 
 lustre-de...@lists.lustre.org
 Date: Wed, 14 Mar 2012 18:31:29 -0600
 Subject: [Lustre-discuss] Lustre and cross-platform portability
 Whamcloud and EMC are jointly investigating how to be able to contribute the 
 Lustre client code into the upstream Linux kernel.
 
 As a prerequisite to this, EMC is working to clean up the Lustre client code 
 to better match the kernel coding style, and one of the anticipated major 
 obstacles to upstream kernel submission is the heavy use of code abstraction 
 via libcfs for portability to other operating systems (most notably MacOS 
 and WinNT, but also for liblustre, and potentially *BSD).
 
 I have no information that the WinNT project will ever be released by 
 Oracle, and as yet there has not been any code released from the MacOS port, 
 so the libcfs portability layer is potentially exacting a high cost in code 

Re: [Lustre-discuss] [Lustre-devel] [wc-discuss] Re: [Twg] Lustre and cross-platform portability

2012-03-21 Thread Nathan Rutman

On Mar 16, 2012, at 8:06 AM, Todd, Allen wrote:

 
 On Friday, March 16, 2012 6:11 AM, Gregory Matthews wrote:
 why bother having a windows client if you lose the performance? We have
 windows based detectors and proprietary windows based analysis software
 that would definitely benefit from higher performance access to lustre
 file systems but replacing existing CIFS servers for no gain seems a bit
 pointless.

1. Re-exporting Lustre via CIFS or NFS isn't scalable to very large numbers of 
Windows clients
2. Re-exporting Lustre via CIFS or NFS can have coherency problems when 
multiple re-exporters are involved
3. Wide-striped file access would likely have greater performance on a native 
client than via a re-exported single pipe, for some value of wide.

 
 From the perspective of my firm, the benefit of a windows lustre client 
 (native or fuse-based), even one that performs at 10% or 20% of the linux 
 client, is the improved scalability that it offers.  Our current solution 
 uses 4 samba gateways per hundred windows servers to achieve acceptable 
 bandwidth, but that bandwidth can be cut to an unacceptable trickle if a 
 large wave of native linux clients simultaneously accesses the filesystem.
 
 We have repeatedly asked our Microsoft HPC contacts to intervene to either 
 fund a native windows client or to get oracle to release the existing one, 
 since scalable storage is a big hole in the Microsoft HPC toolkit.  
 Obviously, nothing has come out of that.  It seems Microsoft is banking on 
 pNFS eventually working in this space.
 
 Allen Todd
 
 
 
 
 
 
 ___
 Lustre-devel mailing list
 lustre-de...@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-devel
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Test restructuring

2011-10-07 Thread Nathan Rutman
Hi Chris - 
talking to Eric after EOFS he mentioned that you were not just interested in 
automating the existing Lustre tests but also in restructuring how they were 
designed / set up to make them more reliable, repeatable, and easier to run and 
automate.  Or, at least that is my desire and I was hoping you had similar 
thoughts / plans.  
In particular, I'm interested in a few things:
1. Separating out the setup of the filesystem from the execution of the tests.
2. Reducing the ordering dependencies on the tests and clarifying their 
individual prerequisites and postconditions.  
3. Reorganizing the tests into more logical groupings.
4. Cutting out redundant / ineffective tests.

Are you / WhamCloud pursuing any of this at the moment?

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [Lustre-devel] Queries regarding Lustre Throughput Numbers with mdtest benchmark

2011-07-08 Thread Nathan Rutman
I suspect that you are running this test against local hard drives instead of a 
shared Lustre mount point.
Are you sure Lustre is mounted at /tmp/l66 on all clients in hostfile?
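
One quick way to check (just a sketch, assuming you can ssh to the hosts in
your hostfile) is to confirm the filesystem type of the test directory on
every node:

    for h in $(awk '{print $1}' ./hostfile | sort -u); do
        ssh $h 'stat -f -c %T /tmp/l66'
    done

Every node should report lustre; tmpfs or ext3/ext4 would mean mdtest is
creating files on a local disk (or in RAM), which would easily explain numbers
this high.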


On Jul 6, 2011, at 7:25 PM, Andreas Dilger wrote:

 On 2011-06-22, at 5:06 PM, vilobh meshram wrote:
 I have a query regarding Lustre throughput numbers with the mdtest benchmark. I 
 am running the mdtest benchmark with the following options:
 
 /home/meshram/mpich2-new/mpich2-1.4/mpich2-install/bin/mpirun -np 256 
 -hostfile ./hostfile ./mdtest -z 3 -b 10 -I 5 -v -d /tmp/l66
 
 
 where ,
 mdtest is the standard benchmark to test metadata operations. 
 [https://computing.llnl.gov/?set=code&page=sio_downloads ]
 /tmp/l66 is my Lustre mount.
 I am using 1Gige Network with TCP transport.
 hostfile has 8 host nodes
 I am varying the number of processes as you can see in the following table. 
 I was amazed by the throughput I got; it seems too high. Can 
 someone please let me know if these numbers are correct?
 
 
 
 4 Process  : File creation :  0.501 sec,  44392.140 ops/sec
 8 Process  : File creation :  0.685 sec,  64890.598 ops/sec
 16 Process: File creation :  1.426 sec,  62318.798 ops/sec
 32 Process: File creation :  2.947 sec,  60312.766 ops/sec
 64 Process: File creation :  5.630 sec,  63142.760 ops/sec
 128 Process  : File creation : 13.208 sec,  53835.707 ops/sec
 256 Process  : File creation : 24.601 sec,  57804.777 ops/sec
 
 Seems nobody has responded to your email, and I just found it buried in my
 inbox.
 
 I agree that the numbers are quite high, especially for GigE networking.
 More typical numbers are in the 5-20k creates/sec (depending on network
 and MDS hardware).
 
 That said, the above may not be completely impossible for small file
 counts (allowing all of the creates to be served from cache), or if the
 client+MDS+OSS are all on the same node (which avoids any network latency).
 
 You also didn't describe the Lustre filesystem, nor what version you are
 testing.  This workload has gotten faster with Lustre 2.1, and it stands 
 to get even faster in Lustre 2.2.
 
 Cheers, Andreas
 --
 Andreas Dilger 
 Principal Engineer
 Whamcloud, Inc.
 
 
 
 ___
 Lustre-devel mailing list
 lustre-de...@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-devel

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Too many client eviction

2011-05-03 Thread Nathan Rutman

On May 3, 2011, at 10:09 AM, DEGREMONT Aurelien wrote:

 Correct me if I'm wrong, but looking at the Lustre manual, it says 
 that the client adapts its timeout, but not the server. I understood 
 that server-to-client RPCs still use the old mechanism, especially in our 
 case where it seems the server is revoking a client lock (is ldlm_timeout 
 used for that?) and the client did not respond.

Server and client cooperate on the adaptive timeouts.  I don't 
remember which bug the ORNL settings were in, maybe 14071; bugzilla's not 
responding at the moment.  But a big question here is why 25315 seconds for a 
callback - that's well beyond anything at_max should allow...
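
(For reference, something like the following on the servers should show the
adaptive-timeout limits actually in effect; a sketch, and parameter names may
vary slightly by version:

    lctl get_param at_min at_max at_history timeout ldlm_timeout

With at_max at its default of 600s, a 25315s lock callback timer is hard to
explain from AT alone.)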
 

 
 I forgot to say that we have LNET routers also involved for some cases.
 
 Thank you
 
 Aurélien
 
 Andreas Dilger a écrit :
 I don't think ldlm_timeout and obd_timeout have much effect when AT is 
 enabled. I believe that LLNL has some adjusted tunables for AT that might 
 help you (increased at_min, etc).
 
 Hopefully Chris or someone at LLNL can comment. I think they were also 
 documented in bugzilla, though I don't know the bug number. 
 
 Cheers, Andreas
 
 On 2011-05-03, at 6:59 AM, DEGREMONT Aurelien aurelien.degrem...@cea.fr 
 wrote:
 
 
 Hello
 
 We often see some of our Lustre clients being evicted spuriously (the clients 
 seem healthy).
 The pattern is always the same:
 
 All of this on Lustre 2.0, with adaptative timeout enabled
 
 1 - A server complains about a client :
 ### lock callback timer expired... after 25315s...
 (nothing on client)
 
 (few seconds later)
 
 2 - The client receives -107 to a obd_ping for this target
 (server says @@@processing error 107)
 
 3 - Client realize its connection was lost.
 Client notices it was evicted.
 It reconnects.
 
 (To be sure) When a client is evicted, all in-flight I/O is lost, and no 
 recovery will be done for it?
 
 We are thinking of increasing the timeout to give clients more time to 
 answer the LDLM revocation
 (maybe the client is just too loaded).
 - Is ldlm_timeout enough to do so?
 - Do we also need to change obd_timeout accordingly? Is there a risk 
 of triggering new timeouts if we only change ldlm_timeout (cascading timeouts)?
 
 Any feedback in this area is welcomed.
 
 Thank you
 
 Aurélien Degrémont
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] [Lustre-devel] Research on filesystem metadata operation distribution

2011-04-22 Thread Nathan Rutman
Hmm - also 'uptime' would really be nice to have with these so we could 
estimate rough rates...
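
Something like this in one go, for example (just combining the commands Andreas
asks for below):

    uptime
    lctl get_param mds.*.stats | egrep "open|close|rename|link|attr|sync"
    lfs df
    lfs df -i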

On Apr 21, 2011, at 11:40 AM, Andreas Dilger wrote:

 I'm trying to get some data about the relative distribution of MDS operations 
 in the wild, and I'd be grateful if some people with production filesystems 
 that have been running for at least a week could collect some simple stats 
 and email them to me.  They can be collected by any regular user on the MDS 
 node:
 
    lctl get_param mds.*.stats | egrep "open|close|rename|link|attr|sync"
 
 It would be useful to also include lfs df and lfs df -i information, as 
 well as a brief description of what the filesystem is used for (scratch, 
 home, project, archive, etc).
 
 
 
 As a reminder, I'm also interested if some Lustre admins could run the 
 fsstats tool from http://www.pdsi-scidac.org/fsstats/ and send me the 
 output.  Sending the output to PDSI via their submission form may also 
 produce some positive results.
 
http://www.pdsi-scidac.org/fsstats/files/fsstats-1.4.5.tar.gz
 
 
 Thanks in advance for any data.  I've set replies to go only to lustre-devel, 
 to avoid clogging the larger readership of lustre-discuss, but it may be 
 useful for others to have this in a list archive and/or searchable via Google 
 in the future so I don't necessarily want to keep it all to myself.
 
 Cheers, Andreas
 --
 Andreas Dilger 
 Principal Engineer
 Whamcloud, Inc.
 
 
 
 ___
 Lustre-devel mailing list
 lustre-de...@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-devel

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre community build server

2011-01-19 Thread Nathan Rutman
Hi Aurelien, Robert  - 

We also use Hudson and are interested in using it to do Lustre builds and 
testing.
 Hi
 
 Robert Read a écrit :
  Hi Aurélien,
 
  Yes, we've noticed Hudson's support for testing is not quite what we need, 
  so 
  we're planning to use Hudson to trigger our testing system, but not 
  necessarily to manage it.  We'd definitely be interested in learning more 
  about your experiences, though. 

 I do not know what you mean by triggering your testing system. But here 
 is what I set up.
 Hudson has only 1 slave node dedicated to testing Lustre 2.
 Hudson will launch a shell script through ssh to it.
 
 This script:
  - retrieves Lustre source (managed by Hudson git plugin)
  - compiles it.
  - launches acceptance-small with several parameters.
  - acceptance-small will connect to other nodes dedicated for these tests.
 
  acc-sm has been patched:
  - to be more error resilient (it does not stop at the first failure)
  - to generate a test report in JUnit format.
Is this the yaml acc-sm that Robert was talking about, or an older one?
 
 Hudson fetch the junit report and parse it thanks to its plugin.
 Hudson can display in its interface all tests successes and failures.
 
 Everything goes fine as long as:
   - the testsuite leaves the node in good shape. It is difficult to 
  have an automatic way to put the node back. Currently, we need to manually 
  fix that.
Would it be helpful to run the test in a VM?  Hudson has a libvirt-slave plugin 
that
seems like it can start and stop a VM for you.  Another point I like about VM's 
is
that they can be suspended and shipped to an investigator for local debugging.

   - Hudson does not know about the other nodes used by acc-sm, and so can 
  trigger tests even if some satellite nodes are unavailable.
Don't know if libvirt-slave can handle multiple
VM's for a multi-node Lustre test, but maybe it can be extended.
 
  How do you do this on your side?
It seems that like you, we are more interested in reporting the results
within Hudson as opposed to a different custom tool.

Nathan
 
 
 Aurélien
 
  robert
 
 
 
 
  On Dec 16, 2010, at 1:22 , DEGREMONT Aurelien wrote:
 

  Hi Robert,
 
  That's very interesting.
   At CEA we also have a Hudson platform and I'm running acceptance-small for 
   several Lustre branches in it. Hudson is a great tool, but it was not 
   designed to test tools that run in kernel space and can crash your nodes 
   or, at least, put your kernel in a bad shape. I would be very interested 
   to share Hudson experience testing Lustre and see how you've configured it 
   for your own tests.
 
 
  Aurélien
 
  Robert Read a écrit :
  
  Hi,
 
   As I mentioned the other day, Whamcloud is hosting a Hudson build server 
   and producing snapshot builds for CentOS 5.x (and Ubuntu 10.04 when it 
   works) for both the 1.8.x and 2.x branches. Our intention is for this to be 
   a resource for the Lustre community to find recent Lustre packages for a 
   variety of Linux distributions. Early next year we'll connect this to our 
   test system so at least some of the packages can be tested as well.
 
   We would be interested in hearing from anyone who would like to 
   participate in producing builds. Hudson is a distributed system, and it's 
   easy to add more build nodes, even behind firewalls (some of us are 
   running build VMs on our home machines). If you would like to add another 
   distribution or architecture we don't have yet, or even one we do have 
   (the more the merrier), we'd be happy to work with you to do that.  Please 
   contact me if you are interested.  
  cheers,
  robert
 
 
 
 
  ___
  Lustre-discuss mailing list
  Lustre-discuss@lists.lustre.org
  http://lists.lustre.org/mailman/listinfo/lustre-discuss
   

 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Fwd: Lustre and Large Pages

2010-08-19 Thread Nathan Rutman
Jim, I'm forwarding this to lustre-discuss to get broader community input.  
I'm sure somebody has some experience with this.

Begin forwarded message:
 
 I am looking for information on how Lustre assigns and holds pages on client 
 nodes across jobs.  The motivation is that we want to make huge pages 
 available to users.  We have found that it is almost impossible to allocate 
 very many huge pages since Lustre holds scattered small pages across jobs.  
 In fact, typically about 1/3 of compute node memory can be allocated as huge 
 pages.
 
 We have done quite a lot of performance studies which show that a substantial 
 percentage of jobs on Ranger have TLB misses as a major performance 
 bottleneck.  We estimate we might recover as much as an additional 5%-10% 
 throughput if users could use huge pages.
 
 Therefore we would like to find a way to minimize the client memory which 
 Lustre holds across jobs.
 
 Have you had anyone else mention this situation to you?
 
 Regards,
 
 Jim Browne
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Problem with write_conf

2010-08-03 Thread Nathan Rutman
There's a 'failsafe' feature that  prevents filesystem name changes:
 LustreError: 157-3: Trying to start OBD AFTER-MDT_UUID using the wrong 
 disk BEFORE-MDT_UUID. Were the /dev/ assignments rearranged?
 
You'll have to go and delete the last_rcvd file off the disk for all the 
servers in the filesystem, as well as tunefs --writeconf them all to the new 
AFTER name.  
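
Roughly something like this on each server device (just a sketch; back up
first, and note last_rcvd sits at the top level of the ldiskfs filesystem):

    mount -t ldiskfs /dev/mapper/map0 /mnt/tmp
    rm /mnt/tmp/last_rcvd
    umount /mnt/tmp
    tunefs.lustre --writeconf --fsname=AFTER /dev/mapper/map0

Repeat for every MDT and OST device in the filesystem, then mount the MDT first
and the OSTs after it.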

On Aug 2, 2010, at 6:08 PM, Roger Spellman wrote:

 
 Hi,
 I would like to be able to change a file system name.  Towards that end, I 
 have run the following commands as an experiment:
 
   mkfs.lustre --reformat --fsname BEFORE  --device-size=1 --mgs --mdt  
 --mgsnode=10.2@o2ib0 /dev/mapper/map0
   dmesg -c
   mount -t lustre /dev/mapper/map0 /mnt/mdt
   dmesg -c
   umount /mnt/mdt
   dmesg -c
   tunefs.lustre --writeconf --fsname=AFTER --mgs --mdt /dev/mapper/map0
   dmesg -c
   mount -t lustre /dev/mapper/map0 /mnt/mdt
   dmesg -c
 
 Unfortunately, this does not work.  Can someone please explain the correct 
 sequence of commands to use?  The output of each command is as follows.
 
 Thanks.
 
 [r...@ts-hss2-01 ~]# mkfs.lustre --reformat --fsname BEFORE  
 --device-size=1 --mgs --mdt  --mgsnode=10.2@o2ib0 /dev/mapper/map0
 
Permanent disk data:
 Target: BEFORE-MDT
 Index:  unassigned
 Lustre FS:  BEFORE
 Mount type: ldiskfs
 Flags:  0x75
   (MDT MGS needs_index first_time update )
 Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
 Parameters: mgsnode=10.2@o2ib mdt.group_upcall=/usr/sbin/l_getgroups
 
 device size = 1632256MB
 2 6 18
 formatting backing filesystem ldiskfs on /dev/mapper/map0
 target name  BEFORE-MDT
 4k blocks 2500
 options-i 4096 -I 512 -q -O dir_index,extents,uninit_groups -F
 mkfs_cmd = mke2fs -j -b 4096 -L BEFORE-MDT  -i 4096 -I 512 -q -O 
 dir_index,extents,uninit_groups -F /dev/mapper/map0 2500
 Writing CONFIGS/mountdata
 [r...@ts-hss2-01 ~]# dmesg -c
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1388, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
 LDISKFS-fs: mballoc: 1 extents scanned, 0 goal hits, 1 2^N hits, 0 breaks, 0 
 lost
 LDISKFS-fs: mballoc: 1 generated and it took 2142
 LDISKFS-fs: mballoc: 512 preallocated, 0 discarded
 
 
 [r...@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
 [r...@ts-hss2-01 ~]# dmesg -c
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1406, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
 LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 
 lost
 LDISKFS-fs: mballoc: 0 generated and it took 0
 LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1410, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 Lustre: MGS MGS started
 Lustre: mgc10.2@o2ib: Reactivating import
 Lustre: Setting parameter BEFORE-MDT.mdt.group_upcall in log 
 BEFORE-MDT
 Lustre: Enabling user_xattr
 Lustre: BEFORE-MDT: new disk, initializing
 Lustre: BEFORE-MDT: Now serving BEFORE-MDT on /dev/mapper/map0 with 
 recovery enabled
 Lustre: 1503:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) BEFORE-MDT: 
 group upcall set to /usr/sbin/l_getgroups
 Lustre: BEFORE-MDT.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
 
 
 [r...@ts-hss2-01 ~]# umount /mnt/mdt
 [r...@ts-hss2-01 ~]# dmesg -c
 Lustre: Failing over BEFORE-MDT
 Lustre: Skipped 1 previous similar message
 Lustre: *** setting obd BEFORE-MDT device 'dm-4' read-only ***
 Turning device dm-4 (0xfd4) read-only
 Lustre: BEFORE-MDT: shutting down for failover; client state will be 
 preserved.
 Lustre: MDT BEFORE-MDT has stopped.
 LustreError: 1517:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -108 
 from cancel RPC: canceling anyway
 LustreError: 1517:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) 
 ldlm_cli_cancel_list: -108
 Lustre: MGS has stopped.
 LDISKFS-fs: mballoc: 3 blocks 3 reqs (0 success)
 LDISKFS-fs: mballoc: 8 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 
 lost
 LDISKFS-fs: mballoc: 1 generated and it took 2598
 LDISKFS-fs: mballoc: 1145 preallocated, 0 discarded
 Removing read-only on unknown block (0xfd4)
 Lustre: server umount BEFORE-MDT complete
 
 
 

Re: [Lustre-discuss] Problem with write_conf

2010-08-03 Thread Nathan Rutman

On Aug 3, 2010, at 11:25 AM, Roger Spellman wrote:

 Nathan,
  
 Thank you.   That works!
  
 I found that if I change IP address, I also need to remove the file  
 /mnt/mdt/CONFIGS/*-client.

This is what tunefs.lustre --writeconf on the MDT does, when you first mount it 
after the writeconf.
--writeconf on the MDT and all OSTs is the preferred way of changing a server 
nid.
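
For the record, the usual sequence for a NID change (a rough sketch) is: unmount
the clients and all targets, then on each server device run

    tunefs.lustre --writeconf --mgsnode=<mgs_nid> /dev/<target_device>

and remount MGS/MDT first, then the OSTs, then the clients, so the config logs
(including the *-client log you removed by hand) are regenerated automatically.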
 
  
 The reason is that the OST mounts failed – the OST was still looking for the 
 old IP Address.  I grepped for files with the old IP Address, and I found 
 those client files.
 
 Is that a safe thing to do?  Please note that my mdt and mgs are on the same 
 LUN.
  
 Thanks.
  
 -Roger
  
  
 From: Nathan Rutman [mailto:nathan.rut...@oracle.com] 
 Sent: Tuesday, August 03, 2010 2:03 PM
 To: Roger Spellman
 Cc: lustre-discuss@lists.lustre.org
 Subject: Re: [Lustre-discuss] Problem with write_conf
  
 There's a 'failsafe' feature that  prevents filesystem name changes:
 LustreError: 157-3: Trying to start OBD AFTER-MDT_UUID using the wrong 
 disk BEFORE-MDT_UUID. Were the /dev/ assignments rearranged?
 
 You'll have to go and delete the last_rcvd file off the disk for all the 
 servers in the filesystem as well as tunefs --writeconf them all to the name 
 AFTER name.  
  
 On Aug 2, 2010, at 6:08 PM, Roger Spellman wrote:
 
 
  
 Hi,
 I would like to be able to change a file system name.  Towards that end, I 
 have run the following commands as an experiment:
 
   mkfs.lustre --reformat --fsname BEFORE  --device-size=1 --mgs --mdt  
 --mgsnode=10.2@o2ib0 /dev/mapper/map0
   dmesg -c
   mount -t lustre /dev/mapper/map0 /mnt/mdt
   dmesg -c
   umount /mnt/mdt
   dmesg -c
   tunefs.lustre --writeconf --fsname=AFTER --mgs --mdt /dev/mapper/map0
   dmesg -c
   mount -t lustre /dev/mapper/map0 /mnt/mdt
   dmesg -c
 
 Unfortunately, this does not work.  Can someone please explain the correct 
 sequence of commands to ues?  The output of each command is as follows.
 
 Thanks.
 
 [r...@ts-hss2-01 ~]# mkfs.lustre --reformat --fsname BEFORE  
 --device-size=1 --mgs --mdt  --mgsnode=10.2@o2ib0 /dev/mapper/map0
 
Permanent disk data:
 Target: BEFORE-MDT
 Index:  unassigned
 Lustre FS:  BEFORE
 Mount type: ldiskfs
 Flags:  0x75
   (MDT MGS needs_index first_time update )
 Persistent mount opts: iopen_nopriv,user_xattr,errors=remount-ro
 Parameters: mgsnode=10.2@o2ib mdt.group_upcall=/usr/sbin/l_getgroups
 
 device size = 1632256MB
 2 6 18
 formatting backing filesystem ldiskfs on /dev/mapper/map0
 target name  BEFORE-MDT
 4k blocks 2500
 options-i 4096 -I 512 -q -O dir_index,extents,uninit_groups -F
 mkfs_cmd = mke2fs -j -b 4096 -L BEFORE-MDT  -i 4096 -I 512 -q -O 
 dir_index,extents,uninit_groups -F /dev/mapper/map0 2500
 Writing CONFIGS/mountdata
 [r...@ts-hss2-01 ~]# dmesg -c
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1388, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 LDISKFS-fs: mballoc: 1 blocks 1 reqs (0 success)
 LDISKFS-fs: mballoc: 1 extents scanned, 0 goal hits, 1 2^N hits, 0 breaks, 0 
 lost
 LDISKFS-fs: mballoc: 1 generated and it took 2142
 LDISKFS-fs: mballoc: 512 preallocated, 0 discarded
 
 
 [r...@ts-hss2-01 ~]# mount -t lustre /dev/mapper/map0 /mnt/mdt
 [r...@ts-hss2-01 ~]# dmesg -c
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1406, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
 LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 
 lost
 LDISKFS-fs: mballoc: 0 generated and it took 0
 LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
 LDISKFS-fs: barriers enabled
 kjournald2 starting: pid 1410, dev dm-4:8, commit interval 5 seconds
 LDISKFS FS on dm-4, internal journal on dm-4:8
 LDISKFS-fs: delayed allocation enabled
 LDISKFS-fs: file extents enabled
 LDISKFS-fs: mballoc enabled
 LDISKFS-fs: mounted filesystem dm-4 with ordered data mode
 Lustre: MGS MGS started
 Lustre: mgc10.2@o2ib: Reactivating import
 Lustre: Setting parameter BEFORE-MDT.mdt.group_upcall in log 
 BEFORE-MDT
 Lustre: Enabling user_xattr
 Lustre: BEFORE-MDT: new disk, initializing
 Lustre: BEFORE-MDT: Now serving BEFORE-MDT on /dev/mapper/map0 with 
 recovery enabled
 Lustre: 1503:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) BEFORE-MDT: 
 group upcall set to /usr/sbin/l_getgroups
 Lustre: BEFORE-MDT.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
 
 
 [r...@ts-hss2-01 ~]# umount /mnt/mdt
 [r...@ts-hss2-01 ~]# dmesg -c

Re: [Lustre-discuss] OST 1.6.6-1.8.3 upgrade

2010-07-27 Thread Nathan Rutman

On Jul 27, 2010, at 5:02 AM, Heiko Schröter wrote:

 Hello,
 
 
 Running tunefs.lustre during the upgrade, do I have to add the --param 
 option when it was used during the creation of the fs?
 
 i.e. lustre fs during creation: 
 mkfs.lustre --param=failover.mode=failout --fsname lustre --ost 
 --mkfsoptions='-i 2097152 -E stride=16 -b 4096' --mgsnode=m...@tcp0 /dev/sdb
 now upgrading:
 tunefs.lustre --param=failover.mode=failout --fsname lustre --ost 
 --mgsnode=m...@tcp0 /dev/sdb

You actually don't need to run tunefs unless you are changing something.  
Additionally, all parameters should be sticky across tunefs; you can see the 
current settings with 'tunefs --print', add new parameters with 'tunefs 
--param', and erase all old params with 'tunefs --erase'.
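
For example (a sketch; flag spellings can differ a little between versions, and
in particular the erase option is spelled --erase-params in the versions I've
seen):

    tunefs.lustre --print /dev/sdb
    tunefs.lustre --param=failover.mode=failout /dev/sdb
    tunefs.lustre --erase-params --param=failover.mode=failout /dev/sdb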



 A -writeconf is not intended during the upgrade.
In that case, most tunefs changes won't impact anything anyhow :)

 How long will it take (approx) to tunefs an 8TB OST partition ?
No time.  There's nothing that is done to the disk format during tunefs, just 
the modification of a single config file.

 
 Thanks and Regards
 Heiko
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Help with a question about lustre_rsync in Lustre 2.0 Beta1

2010-05-27 Thread Nathan Rutman
No.  This tool is intended to keep an entire (big) FS synched;
if all you need is a subdir you should use regular (not lustre-) rsync.
(I can't think of any reason why that functionality couldn't be added,
but it's not entirely trivial.)
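
For the paths in the example below, that would just be something like:

    rsync -a /mnt/lustre/important_data/ /mnt/target/backup/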

On May 19, 2010, at 8:32 AM, Yujun Wu wrote:

 Hello,
 
 Could somebody help me with a question about the new lustre_rsync
 feature? I saw from the example that lustre_rsync can synchronize a 
 Lustre file system (/mnt/lustre) to a target file system (/mnt/target):
 
 $ lustre_rsync --source=/mnt/lustre --target=/mnt/target ..
 
 ,
 
 Can I just rsync a subdirectory like:
 
 $ lustre_rsync --source=/mnt/lustre/important_data
 --target=/mnt/target/backup ...
 
 ?
 
 
 Thanks in advance for answering my question.
 Yujun
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] 1.6.7.2 Replacing an OST

2010-03-30 Thread Nathan Rutman
You can, but in general this is probably not a good idea.  The  
filesystem will be looking for stripes for the old ost here, and they  
won't be there.
If there were no stripes on the old ost, you can mkfs --index and  
writeconf everywhere.  Writeconf will clear the conf_param deactivation.
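
If you do go that route, a rough sketch would be (placeholders in angle
brackets; OST000c is index 12):

    mkfs.lustre --reformat --ost --fsname=<fsname> --index=12 \
        --mgsnode=<mgs_nid> /dev/<new_ost_device>

followed by the writeconf of all servers mentioned above, which also clears the
conf_param deactivation.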

On Mar 30, 2010, at 12:37 PM, Scott Barber sc...@imemories.com wrote:

 All servers are 1.6.7.2 CentOS 5 x86_64

 We had an OST that we removed previously and had marked as 
 inactive by:
 lctl --device number deactivate
 and then
 lctl conf_param name.osc.active=0

 If I have a new OST to add to the system can I format it to replace
 the removed OST? Can I pass mkfs.lustre the old OST name (ie:
 OST000c)? If so do I have to reactivate the OST?

 Thanks,
 Scott Barber
 Senior System Administrator
 iMemories.com
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss