Re: [Lustre-discuss] New wc-discuss Lustre Mailing List
Hi,

We've received feedback that some people are unwilling to join the wc-discuss list because they believe it requires a Gmail address. Luckily, this is not the case. It is possible to subscribe to wc-discuss (or any Google group, for that matter) using a non-Gmail address.

The key step is to create a Google (not Gmail) account using your own email address. There are "create an account now" links on Google login pages, or this link should take you to the sign-up screen, where you can use an already existing account:

https://www.google.com/accounts/SignUpWidget

Once created, your new account will work like any other web service account associated with your email address, and will allow you to sign up for Google groups like wc-discuss.

cheers,
robert

On Jun 29, 2011, at 12:01 , Andreas Dilger wrote:
> Hi,
> I'd like to announce the creation of a new mailing list for discussing Lustre releases from Whamcloud. We will also continue to monitor and participate on the existing lustre.org mailing lists, but we consider it prudent to host a separate list from lustre.org due to uncertainty regarding the long-term plans for lustre.org.
>
> Subscription information and archives are available at:
>
> https://groups.google.com/a/whamcloud.com/group/wc-discuss/
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Engineer
> Whamcloud, Inc.

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] Does lustre 1.8 stop update and maintenance?
Hi,

I cannot comment on Oracle's plans regarding Lustre, but Whamcloud does intend to continue supporting 1.8.x for some time. You can see activity related to 1.8.x (as well as 2.1) at http://jira.whamcloud.com.

cheers,
robert read
Whamcloud, Inc

On Mar 1, 2011, at 4:48 , Larry wrote:
> Hi, all
> Has Lustre 1.8 stopped receiving updates and maintenance? I have not seen any updates for a long time. Only Whamcloud releases Lustre 2.1. Does that mean Oracle has frozen the development of Lustre?
Re: [Lustre-discuss] Lustre community build server
Hi Aurélien,

Yes, we've noticed Hudson's support for testing is not quite what we need, so we're planning to use Hudson to trigger our testing system, but not necessarily to manage it. We'd definitely be interested in learning more about your experiences, though.

robert

On Dec 16, 2010, at 1:22 , DEGREMONT Aurelien wrote:
> Hi Robert,
> That's very interesting. At CEA we also have a Hudson platform and I'm running acceptance-small for several Lustre branches in it. Hudson is a great tool, but it was not designed to test tools that run in kernel space and can crash your nodes or, at least, put your kernel in bad shape. I would be very interested in sharing Hudson experience testing Lustre and seeing how you've configured it for your own tests.
>
> Aurélien
>
> Robert Read wrote:
>> Hi,
>> As I mentioned the other day, Whamcloud is hosting a Hudson build server and producing snapshot builds for CentOS 5.x (and Ubuntu 10.4 when it works) for both 1.8.x and 2.x branches. Our intention is for this to be a resource for the Lustre community to find recent Lustre packages for a variety of Linux distributions. Early next year we'll connect this to our test system so at least some of the packages can be tested, as well.
>>
>> We would be interested in hearing from anyone who would like to participate in producing builds. Hudson is a distributed system, and it's easy to add more build nodes, even behind firewalls (some of us are running build VMs on our home machines). If you would like to add another distribution or architecture we don't have yet, or even one we do have (the more the merrier), we'd be happy to work with you to do that. Please contact me if you are interested.
>>
>> cheers,
>> robert
Re: [Lustre-discuss] Lustre community build server
Hi,

On Dec 16, 2010, at 9:42 , DEGREMONT Aurelien wrote:
> Hi
>
> Robert Read wrote:
>> Hi Aurélien,
>> Yes, we've noticed Hudson's support for testing is not quite what we need, so we're planning to use Hudson to trigger our testing system, but not necessarily to manage it. We'd definitely be interested in learning more about your experiences, though.
>
> I do not know what you mean by triggering your testing system. But here is what I set up.

I mean that once the build is complete we will notify the test system that a new build is ready to be picked up and tested. We haven't yet implemented that part of it.

> Hudson has only 1 slave node dedicated to testing Lustre 2. Hudson will launch a shell script through ssh to it. This script:
> - retrieves the Lustre source (managed by the Hudson git plugin)
> - compiles it.
> - launches acceptance-small with several parameters.
> - acceptance-small will connect to other nodes dedicated to these tests.
>
> acc-sm has been patched:
> - to be more error resilient (does not stop at first failure)
> - to generate a test report in JUnit format.
>
> Hudson fetches the JUnit report and parses it thanks to its plugin. Hudson can display in its interface all test successes and failures.
>
> Everything goes fine as long as:
> - the testsuite leaves the node in good shape. It is difficult to have an automatic way to put the node back. Currently, we need to fix that manually.
> - Hudson does not know about the other nodes used by acc-sm, and so can trigger tests even if some satellite nodes are unavailable.
>
> How do you do this on your side?

We don't plan to use Hudson to manage our testing results, as I don't think it would scale very well for all the testing we might do for each build. We're currently building a more custom results server that's similar (in spirit at least) to the kinds of tools we had at Oracle. We'll make it available once it's in presentable form.

Actually, our first step was to replace the acceptance-small.sh driver script with one that has a more sensible user interface for running the standard tests. Since the test-framework.sh on master has already been changed to produce test results in YAML format, the new script collects these with the logs, and is capable of submitting them to the test results server. Currently this is being run manually, though. Automating the test execution and connecting all the pieces will be the next step.

cheers,
robert

> Aurélien
>
>> robert
>>
>> On Dec 16, 2010, at 1:22 , DEGREMONT Aurelien wrote:
>>> Hi Robert,
>>> That's very interesting. At CEA we also have a Hudson platform and I'm running acceptance-small for several Lustre branches in it. Hudson is a great tool, but it was not designed to test tools that run in kernel space and can crash your nodes or, at least, put your kernel in bad shape. I would be very interested in sharing Hudson experience testing Lustre and seeing how you've configured it for your own tests.
>>>
>>> Aurélien
>>>
>>> Robert Read wrote:
>>>> Hi,
>>>> As I mentioned the other day, Whamcloud is hosting a Hudson build server and producing snapshot builds for CentOS 5.x (and Ubuntu 10.4 when it works) for both 1.8.x and 2.x branches. Our intention is for this to be a resource for the Lustre community to find recent Lustre packages for a variety of Linux distributions. Early next year we'll connect this to our test system so at least some of the packages can be tested, as well.
>>>>
>>>> We would be interested in hearing from anyone who would like to participate in producing builds. Hudson is a distributed system, and it's easy to add more build nodes, even behind firewalls (some of us are running build VMs on our home machines). If you would like to add another distribution or architecture we don't have yet, or even one we do have (the more the merrier), we'd be happy to work with you to do that. Please contact me if you are interested.
>>>>
>>>> cheers,
>>>> robert
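The message above mentions patching acc-sm so that it keeps going after a failure and emits a JUnit report that Hudson can parse. A minimal sketch of that reporting step follows, written in Python purely for illustration. The function name `junit_report` and the (test name, error message) result format are hypothetical and are not taken from the actual acc-sm patch; only the general JUnit XML element names (`testsuite`, `testcase`, `failure`) are standard.

```python
import xml.etree.ElementTree as ET


def junit_report(suite_name, results):
    """Build a JUnit-style XML string from a list of test results.

    `results` is a list of (test_name, error_message) pairs, where
    error_message is None for a passing test. Every test is recorded,
    so one failure does not hide the results that follow it.
    (Hypothetical helper, not part of Lustre or Hudson.)
    """
    failures = sum(1 for _, err in results if err is not None)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for test_name, err in results:
        case = ET.SubElement(
            suite, "testcase", classname=suite_name, name=test_name
        )
        if err is not None:
            # A <failure> child marks the test case as failed.
            failure = ET.SubElement(case, "failure", message=err)
            failure.text = err
    return ET.tostring(suite, encoding="unicode")


if __name__ == "__main__":
    print(junit_report("sanity", [("test_1", None), ("test_2", "timeout")]))
```

A CI server that understands JUnit XML could then be pointed at a file containing this output; how the real setup collected results from the other test nodes is not described in the thread.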
[Lustre-discuss] Lustre community build server
Hi,

As I mentioned the other day, Whamcloud is hosting a Hudson build server and producing snapshot builds for CentOS 5.x (and Ubuntu 10.4 when it works) for both 1.8.x and 2.x branches. Our intention is for this to be a resource for the Lustre community to find recent Lustre packages for a variety of Linux distributions. Early next year we'll connect this to our test system so at least some of the packages can be tested, as well.

We would be interested in hearing from anyone who would like to participate in producing builds. Hudson is a distributed system, and it's easy to add more build nodes, even behind firewalls (some of us are running build VMs on our home machines). If you would like to add another distribution or architecture we don't have yet, or even one we do have (the more the merrier), we'd be happy to work with you to do that. Please contact me if you are interested.

cheers,
robert
Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts
Hi,

On Sep 17, 2010, at 14:49 , Bernd Schubert wrote:
> Hello Cory,
>
> On 09/17/2010 11:31 PM, Cory Spitz wrote:
>> Hi, Bernd.
>>
>> On 09/17/2010 02:48 PM, Bernd Schubert wrote:
>>> On Friday, September 17, 2010, Andreas Dilger wrote:
>>>> On 2010-09-17, at 12:42, Jonathan B. Horen wrote:
>>>>> We're trying to architect a Lustre setup for our group, and want to leverage our available resources. In doing so, we've come to consider multi-purposing several hosts, so that they'll function simultaneously as MDS and OSS.
>>>>
>>>> You can't do this and expect recovery to work in a robust manner. The reason is that the MDS is a client of the OSS, and if they are both on the same node that crashes, the OSS will wait for the MDS client to reconnect and will time out recovery of the real clients.
>>>
>>> Well, that is some kind of design problem. Even on separate nodes it can easily happen that both MDS and OSS fail, for example in a power outage of the storage rack. In my experience situations like that happen frequently...
>>
>> I think that just argues that the MDS should be on a separate UPS.
>
> Well, there is not only a single reason. The next hardware issue is that maybe an IB switch fails. And then I have also seen cascading Lustre failures. It starts with an LBUG on the OSS, which triggers another problem on the MDS...
> Also, for us this actually will become a real problem, which cannot be easily solved. So this issue will become a DDN priority.

There is always a possibility that multiple failures will occur, and this possibility can be reduced depending on one's resources. The point here is simply that a configuration with an MDS and OSS on the same node will guarantee multiple failures and aborted OSS recovery when that node fails.

cheers,
robert
Re: [Lustre-discuss] Multi-Role/Tasking MDS/OSS Hosts
Hi,

On Sep 17, 2010, at 12:48 , Bernd Schubert wrote:
> On Friday, September 17, 2010, Andreas Dilger wrote:
>> On 2010-09-17, at 12:42, Jonathan B. Horen wrote:
>>> We're trying to architect a Lustre setup for our group, and want to leverage our available resources. In doing so, we've come to consider multi-purposing several hosts, so that they'll function simultaneously as MDS and OSS.
>>
>> You can't do this and expect recovery to work in a robust manner. The reason is that the MDS is a client of the OSS, and if they are both on the same node that crashes, the OSS will wait for the MDS client to reconnect and will time out recovery of the real clients.
>
> Well, that is some kind of design problem. Even on separate nodes it can easily happen that both MDS and OSS fail, for example in a power outage of the storage rack. In my experience situations like that happen frequently...
>
> I think some kind of pre-connection would be required, where a client can tell a server that it was rebooted and that the server shall not wait any longer for it. Actually, it shouldn't be that difficult, as different connection flags already exist. So if the client contacts a server and asks for an initial connection, the server could check for that NID and then immediately abort recovery for that client.

This is an interesting idea, but the NID is not ideal, as this wouldn't be compatible with multiple mounts on the same node. Not very useful in production, perhaps, but very useful for testing.

Another option would be to hash the mount point pathname (and some other data, such as the NID) and use this as the client uuid. Then the client uuid would be persistent across reboots, and the server would rely on flags to detect whether this was a reconnect or a new connection after a reboot or remount.

robert
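The idea at the end of the message above, deriving the client uuid from a hash of the mount point pathname and the NID, can be sketched as follows. This is an illustration in Python under stated assumptions, not Lustre code: the function name `client_uuid`, the use of SHA-1, and the NUL separator between fields are all hypothetical choices, and the sample NID and mount points are made up.

```python
import hashlib
import uuid


def client_uuid(nid: str, mount_point: str) -> uuid.UUID:
    """Derive a stable identifier from a client's NID and mount point.

    Hypothetical helper: because the mount point is part of the hash,
    two mounts on the same node get different UUIDs, while the same
    mount after a reboot or remount gets the same UUID again.
    """
    # A NUL byte separates the fields so ("ab", "c") and ("a", "bc")
    # cannot produce the same input to the hash.
    material = nid.encode("utf-8") + b"\0" + mount_point.encode("utf-8")
    digest = hashlib.sha1(material).digest()
    # Use the first 16 bytes of the digest as the 128-bit UUID value.
    return uuid.UUID(bytes=digest[:16])


if __name__ == "__main__":
    print(client_uuid("192.168.1.10@tcp", "/mnt/lustre"))
    print(client_uuid("192.168.1.10@tcp", "/mnt/lustre2"))
```

With an identifier like this the server cannot tell a reconnect from a fresh connection by uuid alone, which is why the message says it would have to rely on connection flags for that distinction.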
Re: [Lustre-discuss] kernel: BUG: soft lockup - CPU stuck for 10s! with lustre 1.8.4
Hi Peter,

Perhaps the link got mangled by your mail client? (It does have some seemingly unusual characters for a URL.) My interpretation of Gabriele's reply is that the problem occurred even with statahead disabled, so in that case this patch might be worth trying.

robert

On Sep 17, 2010, at 10:18 , Peter Jones wrote:
> The URL does not work for me, but if it is a statahead issue then surely turning statahead off would be a simple workaround to avoid having to apply a patch.
>
> Fan Yong wrote:
>> On 9/14/10 8:55 PM, Gabriele Paciucci wrote:
>>> I have the same problem, I put the statahead_max to 0 !!!
>>
>> In fact, I have made a patch for statahead-related issues (including this one) against lustre-1.8, which is in inspection.
>> http://review.whamcloud.com/#change,2
>> If possible, you can try that patch.
>>
>> Cheers,
>> --
>> Nasf
Re: [Lustre-discuss] Local Northern California Lustre Users Group
Yes, I remember discussing this back at the LUG too, and it sounds good to me. I agree that an evening meeting would probably be better in general.

robert

On Aug 4, 2010, at 10:06 AM, D. Marc Stearman wrote:
> I think I was one of those suggesting the same thing. I think a lunchtime meeting is fine for folks on the Peninsula, or in the South Bay, but for those of us in the far East Bay it could take an hour to get there. I was thinking a monthly meeting sometime in the evening for dinner/drinks would work well.
>
> -Marc
>
> D. Marc Stearman
> Lustre Operations Lead
> m...@llnl.gov
> 925.423.9670
> Pager: 1.888.203.0641
>
> On Jun 11, 2010, at 3:54 PM, Sebastian Gutierrez wrote:
>> Hello,
>> While at LUG 2010, a few of us mentioned that we would be interested in a local LUG (LLUG?) in the Bay Area. We discussed that it may be a good idea to meet up about once a month to discuss issues and solutions. If you are still interested, I am able to provide the initial meeting spot. I am offering Stanford as our first meeting spot. If we can settle on a date, we can start meeting either at the end of the month or early next month. I was thinking we could meet on the second Monday or Wednesday of each month, during lunch.
>>
>> What I need: a date that a majority could agree upon, or I can choose a date. Once we decide on a date I will set up an RSVP due date.
>>
>> My rough outline of the time would be:
>> 1. Current updates, if any
>> 2. Someone presents an issue
>> 3. We discuss solutions
>> 4. goto 2
>>
>> Cheers,
>> Sebastian
Re: [Lustre-discuss] Lustre-1.9.210: mkfs.ldiskfs?
On Jul 1, 2009, at 10:03 , Josephine Palencia wrote:
> OS: Centos 5.3, x86_64
> Kernel: 2.6.18-128.1.6
>
> [r...@attractor ~]# cat /proc/fs/lustre/version
> lustre: 1.9.210
> ..
>
> Looking for mkfs.ldiskfs during mkfs.lustre? Did a find on the previous working lustre-1.9 and didn't find one. Pls advise.

What configure options did you use when you built lustre? I think --with-ldiskfsprogs will configure lustre to use these names; however, the standard version of the e2fs utils doesn't use this naming scheme. This would be used, for instance, if you needed to have both the original and ldiskfs versions of e2fsutils installed at the same time.

robert

> [r...@attractor ~]# mkfs.lustre --fsname=jwan0 --mdt --mgs /dev/hdc1
>
> Permanent disk data:
> Target: jwan0-MDT
> Index: unassigned
> Lustre FS: jwan0
> Mount type: ldiskfs
> Flags: 0x75
> (MDT MGS needs_index first_time update )
> Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
> Parameters:
>
> checking for existing Lustre data: not found
> device size = 19085MB
> 2 6 18
> formatting backing filesystem ldiskfs on /dev/hdc1
> target name jwan0-MDT
> 4k blocks 0
> options -J size=400 -i 4096 -I 512 -q -O dir_index,uninit_groups -F
> mkfs_cmd = mkfs.ldiskfs -j -b 4096 -L jwan0-MDT -J size=400 -i 4096 -I 512 -q -O dir_index,uninit_groups -F /dev/hdc1
> sh: mkfs.ldiskfs: command not found
>
> mkfs.lustre FATAL: Unable to build fs /dev/hdc1 (32512)
>
> mkfs.lustre FATAL: mkfs failed 32512
>
> Thanks,
> josephin
Re: [Lustre-discuss] Lustre and Xen
On Oct 4, 2008, at 08:33 , Daniel Ferber wrote:
> I have two questions:
>
> 1. On the OSS/OST side, can one run Lustre as a server on a system running a Xen kernel?
> 2. Conversely: is the Lustre client compatible with Xen kernel machines?

The answer to both questions is yes. I routinely test Lustre servers and clients on Xen using a RHEL 5 kernel with our patches, configured as a Xen guest. I have not tried a patchless client on a RHEL 5 Xen kernel yet, but I would expect that to work if you built our modules for it.

robert

> Thanks much,
> Dan
>
> --
> Dan Ferber, [EMAIL PROTECTED]
> +1 612-486-5167 (Office)
> +1 651-356-9481 (Cell)
> x29704 (Sun Internal)
Re: [Lustre-discuss] Lustre and Xen
On Oct 6, 2008, at 16:03 , Andreas Dilger wrote:
> On Oct 04, 2008 10:33 -0500, Daniel Ferber wrote:
>> I have two questions:
>>
>> 1. On the OSS/OST side, can one run Lustre as a server on a system running a Xen kernel?
>> 2. Conversely: is the Lustre client compatible with Xen kernel machines?
>
> Note that "work" and "perform well" are two completely different issues. While CPU virtualization is quite efficient these days, IO virtualization is not necessarily very fast at all.

I haven't benchmarked IO on Xen in a long time, but I'd expect a pretty reasonable percentage of native IO at this point. Not quite as efficient as CPU virtualization, of course, but I wouldn't necessarily say "not very fast" either.

robert