Re: [Lustre-discuss] Enqueue wait from MDS log

2010-01-06 Thread Andreas Dilger
but I don't know when that was done, so it might not appear until 1.8.2. > On Wed, Jan 6, 2010 at 5:22 PM, Andreas Dilger > wrote: >> On 2010-01-06, at 01:42, Tung Dam wrote: >> I have an issue with lustre log from our MDS, like this: >> >> Jan 6 14:00:

Re: [Lustre-discuss] Enqueue wait from MDS log

2010-01-06 Thread Andreas Dilger
re. In particular, with FLK (flock) type locks, they can be held indefinitely, so there is no reason to print a message at all. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discu

Re: [Lustre-discuss] The client profile could not be read from the MGS

2010-01-05 Thread Andreas Dilger
unt the clients at some point before you make any changes to the configuration in the future (e.g. adding an OST or setting tunables) as the currently-mounted clients will likely not detect these due to the new configuration that was created. Cheers, Andreas -- A

Re: [Lustre-discuss] MDS crashes daily at the same hour

2010-01-04 Thread Andreas Dilger
> Jan 4 06:33:31 tech-mds kernel: child_rip+0xa/0x11 > Jan 4 06:33:31 tech-mds kernel: :ptlrpc:ptlrpc_main+0x0/0x13e0 > Jan 4 06:33:31 tech-mds kernel: child_rip+0x0/0x11 It shouldn't LBUG during recovery, however. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustr

Re: [Lustre-discuss] OST crashed after slow journal messages

2010-01-01 Thread Andreas Dilger
On 2010-01-01, at 17:20, Erik Froese wrote: > On Thu, Dec 31, 2009 at 4:52 PM, Andreas Dilger > wrote: >> These are usually a sign that the back-end storage is overloaded, >> or somehow performing very slowly. Maybe there was a RAID rebuild >> going on? > >

Re: [Lustre-discuss] Lustre reading file and RAM

2010-01-01 Thread Andreas Dilger
at file. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/mailman/listinfo/lustre-discuss

Re: [Lustre-discuss] OST crashed after slow journal messages

2009-12-31 Thread Andreas Dilger
> the clients. Also lfs find /scratch -O scratch-OST000e_UUID HANGS! You also need to deactivate it on the clients, at which point they will get an IO error when accessing files on that OST. > Are we screwed here? Is there a way to run lfs find with the OST > disabled? Shouldn't

Re: [Lustre-discuss] NFS re-exporting lustre

2009-12-31 Thread Andreas Dilger
exports freezing after a few minutes of heavy i/o and > having to reboot the nfs server for activity to resume. I don't do this in any serious capacity, but I use NFS re-export to share my music collection to the Mac clients in my house without trouble. Cheers, Andreas -- Andreas Dilger S

Re: [Lustre-discuss] MD1000 woes and OSS migration suggestions

2009-12-31 Thread Andreas Dilger
ented as part of the HSM project. The HSM functionality will be available in the 2.1 release (end 2010/early 2011), and online migration can be completed after that time. > On Dec 30, 2009, at 5:44 PM, Andreas Dilger wrote: >> On 2009-12-29, at 19:33, Nick Jennings wrote: >>> Hi

Re: [Lustre-discuss] What stripe size and extent size to choose for Lustre

2009-12-30 Thread Andreas Dilger
r if a single client needs more bandwidth than what a single OST can provide. Otherwise, adding more stripes just adds overhead to file IO. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lust

Re: [Lustre-discuss] MD1000 woes and OSS migration suggestions

2009-12-30 Thread Andreas Dilger
> offline, and connect it to the second partition of new MD1000 and > bring > that end online once more. There is a section in the manual about "manual data migration", which should let you move the data. Note that this is not 100% transparent to applications, but it is safe i

Re: [Lustre-discuss] stripe offset and hot-spots

2009-12-30 Thread Andreas Dilger
On 2009-12-29, at 13:51, Christopher J. Morrone wrote: > Andreas Dilger wrote: >> Well, that is already the default, unless it has been changed at >> some time in the past by someone at your site. We generally >> recommend against ever changing the starting index of fi

Re: [Lustre-discuss] e2scan wrong file list mtime/ctime

2009-12-23 Thread Andreas Dilger
ith "-w" (write mode). Without that you shouldn't be able to modify the filesystem, and it is safe. > Andreas Dilger wrote: >> On 2009-12-16, at 06:28, Miguel Molowny Lopez wrote: >> >>> we are running lustre 1.8.1.1 on our storage cluster based on IB. >

Re: [Lustre-discuss] lustre 1.6.7.2 client kernel panic

2009-12-21 Thread Andreas Dilger
, > it's the only thing that I was doing on the server. Normal operation > on > the filesystem is Apache & nothing else. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lus

Re: [Lustre-discuss] performance tuning w/ dbench, bonnie++

2009-12-21 Thread Andreas Dilger
ne Lustre further, depending on what it is your application is doing, but it is pointless to optimize for dbench, since it is unlikely to be doing exactly what your application is doing. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lu

Re: [Lustre-discuss] MGT of 128 MB - already out of space

2009-12-18 Thread Andreas Dilger
> Is it possible to run --writeconf to fix this? If all of the space is really consumed by the config files, are you using a lot of "lctl conf_param" commands, ost pools, or something else that would put a lot of records into the config logs? Cheers, Andreas -- Andreas D

Re: [Lustre-discuss] async journals

2009-12-18 Thread Andreas Dilger
e time, there is enough IO per transaction that the commit does not noticeably affect performance. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] e2scan wrong file list mtime/ctime

2009-12-16 Thread Andreas Dilger
via "debugfs -c -R 'stat foobar.txt'", in addition to checking via "stat /scratch/foobar.txt" in the filesystem. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] failover problems

2009-12-11 Thread Andreas Dilger
7 > One Cyclotron Rd, MS: 50B-3209C > Lawrence Berkeley National Lab > Berkeley, CA 94720 Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] abnormally long ftruncates on Cray XT4

2009-12-11 Thread Andreas Dilger
the > consistency and magnitude of these hangs. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] 1.8.1.1

2009-12-10 Thread Andreas Dilger
different issue. The previous stack was busy in lustre_hash_for_each_empty(). > Should I install this patched lustre on the clients too? Or is the > problem something else? This is strictly a server-side patch. This looks like a similar issue, and may be fixed by one of the other patches

Re: [Lustre-discuss] Heartbeat, LVM and Lustre

2009-12-10 Thread Andreas Dilger
t wrote for GFS2. I'm not sure if they are public or not, but in any case, since Lustre/ldiskfs expects sole ownership of the LVs (and the filesystems therein) there isn't any benefit to having them imported on 2 nodes at once, but a lot of risk. Cheers, Andreas -- Andreas

Re: [Lustre-discuss] fsck of OST problems - endless loop restarting pass 1

2009-12-05 Thread Andreas Dilger
ion.Anyone can > give some suggestion? You should update to a newer version of Lustre. > 2009/12/4 Craig Prescott > Craig Prescott wrote: > > Andreas Dilger wrote: > >> Hmm, the code shouldn't be checking the checksums if the uninit_bg > >> feature is not en

Re: [Lustre-discuss] creating and using loopback device on a file on a lustre filesystem?

2009-12-05 Thread Andreas Dilger
"safe", regardless of whether this is Lustre or not. Some users do this (with ext2) to have a node-local mechanism for accessing a lot of small files. It is also possible to use such loopback files in read-only mode (again with ext2 only) from many nodes at one time. Cheers, Andr

Re: [Lustre-discuss] Performances and fsync()

2009-12-05 Thread Andreas Dilger
", MPI_Wtime()-t1); > > close(fd); > } > > MPI_Barrier(MPI_COMM_WORLD); > fd = open(opt_filename, O_RDWR); > MPI_Barrier(MPI_COMM_WORLD); > > t1 = MPI_Wtime(); > err=fsync(fd); > printf("%.2d: sync : %.6f (err=%

Re: [Lustre-discuss] 1.8.1.1

2009-12-05 Thread Andreas Dilger
t to patch the kernel again? This is trying to build the ldiskfs module from the ext3 sources. It _should_ work, given that you have the right kernel sources, but clearly either the patch was changed, or something is different between your ext3 and what the patch expects. This is norm

Re: [Lustre-discuss] client I/O

2009-12-04 Thread Andreas Dilger
ram debug=+rpctrace sleep 20 lctl dk /tmp/debug grep "Handled.*:[34]$" /tmp/debug Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustr

Re: [Lustre-discuss] Re-balance the un-balanced OSTs

2009-12-04 Thread Andreas Dilger
tmp /path/to/some/files > > However, this way seems does not help too much. I guess this is > because > we do not pull out the files which locate in cwarp-OST exactly. You can find files that have stripes on that OST with: lfs find --obd cwarp-OST0000_UUID /path/to/lustre You can

Re: [Lustre-discuss] OST I/O problems

2009-12-04 Thread Andreas Dilger
o RPC can complete before the timeout, or because there is packet loss. Some things to try: - reduce the number of OSS threads via module parameter: options ost oss_num_threads=N - increase the lustre timeout (details in the manual) Cheers, Andreas --

Re: [Lustre-discuss] Oracle Linux instead of SUSE for 2.0?

2009-12-03 Thread Andreas Dilger
dentally had filed a bug yesterday describing details of how to remove the patches Lustre adds to the core kernel (though some patches would still be needed for ext4). If anyone is interested to track (or better yet, help) this effort, see bug 21524 for details: https://bugzilla.lustre.org/sh

Re: [Lustre-discuss] command akin to "dump waiters"

2009-12-02 Thread Andreas Dilger
_namespaces=1; lctl dk /tmp/debug" but it may not be what you are looking for, since it will only show that node's lock state. I've never seen the output of "mmfsadm dump waiters", so I can't comment on whether they are at all related. Cheers, Andreas -- An

Re: [Lustre-discuss] high IOPS

2009-12-02 Thread Andreas Dilger
On 2009-12-02, at 12:15, Craig Tierney wrote: > Andreas Dilger wrote: >> On 2009-12-02, at 09:20, Francois Chassaing wrote: >>> I have a big fundamental question : >>> if the load that I'll put on the FS is more IOPS-intensive than >>> throughput-intensive

Re: [Lustre-discuss] fsck of OST problems - endless loop restarting pass 1

2009-12-02 Thread Andreas Dilger
struct ext4_group_desc *gdp) { if ((sbi->s_es->s_feature_ro_compat & cpu_to_le32(LDISKFS_FEATURE_RO_COMPAT_GDT_CSUM)) && (gdp->bg_checksum != ldiskfs_group_desc_csum(sbi, block_group, gdp)))

Re: [Lustre-discuss] high IOPS

2009-12-02 Thread Andreas Dilger
than lustre but > poorly failed the bonnie++ create/delete tests. Also I didn't gave a > shot at PVFS2 yet... Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing

Re: [Lustre-discuss] fsck of OST problems - endless loop restarting pass 1

2009-12-01 Thread Andreas Dilger
On 2009-12-01, at 19:01, Craig Prescott wrote: > Andreas Dilger wrote: >> I would start by simply trying to mount the OST filesystem with >> ldiskfs directly (mount options "-o ro" to avoid any further >> corruption or errors, and possibly also "noload"

Re: [Lustre-discuss] fsck of OST problems - endless loop restarting pass 1

2009-12-01 Thread Andreas Dilger
8) > ... (inode #114786311, mod time Fri Oct 10 14:03:48 2008) > ... (inode #114786309, mod time Fri Oct 10 14:03:48 2008) > ??? (inode #114786305, mod time Fri Oct 10 14:03:48 2008) > Clone multiply-claimed blocks? yes > > ... > __

Re: [Lustre-discuss] New lustre setup - opinions sought

2009-12-01 Thread Andreas Dilger
his would be an interesting feature. From your description, I can't see any benefit to having these different OSTs in the same filesystem Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. _

Re: [Lustre-discuss] I/O on cluster with lustre

2009-12-01 Thread Andreas Dilger
else - check /var/log/messages to see if there are Lustre (or other) errors - do "echo t > /proc/sysrq-trigger" to dump the stacks of all processes on the system, and see where your job is stuck Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsyste

[Lustre-discuss] Presentations from Lustre Portland technical meetings

2009-11-27 Thread Andreas Dilger
ki under: http://wiki.lustre.org/index.php/Lustre_Community_Events,_Conferences_and_Meetings In particular, some of the longer-term scalability improvements for the HPCS project have design documents available at: http://wiki.lustre.org/index.php/Lustre_HPCS_Activities Cheers, Andreas -- Andreas Dilger Sr.

Re: [Lustre-discuss] 1.8.1.1

2009-11-27 Thread Andreas Dilger
ause, but you could check the DLM lock stats in /proc/fs/lustre/ldlm/namespaces/*/lock_count on some clients, to see how many locks they are holding, or the same on the MDS, which will be the total number of locks currently granted to all clients. > After them: > > 6009:0:(events.c
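The lock_count check described above could be scripted roughly as follows. This is only a sketch: the glob path is the one quoted in the message, while the function name `total_lock_count` and the assumption that each lock_count file holds a single decimal integer are illustrative and not confirmed by the thread.

```python
import glob


def total_lock_count(pattern="/proc/fs/lustre/ldlm/namespaces/*/lock_count"):
    """Return (total, per_namespace) for every file matching the glob pattern."""
    per_namespace = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            text = f.read().strip()
        # Assumption: the file holds one decimal integer; skip anything else.
        if text.isdigit():
            per_namespace[path] = int(text)
    return sum(per_namespace.values()), per_namespace


if __name__ == "__main__":
    total, per_namespace = total_lock_count()
    for path, count in per_namespace.items():
        print(f"{count:8d}  {path}")
    print(f"{total:8d}  total")
```

Run on a client, this would report the locks that client holds; run on the MDS, the total would correspond to the locks currently granted to all clients, as the message describes.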

Re: [Lustre-discuss] Extent Based Locking Implementation

2009-11-25 Thread Andreas Dilger
unables for this are: /proc/fs/lustre/ldlm/{OST}/contended_locks - number of lock conflicts before a lock is contended (default 4) /proc/fs/lustre/ldlm/{OST}/contention_seconds - seconds to be "conflicted" state until normal locking (default 2s) /proc/fs/lustre/ldlm/{OST}/max_nol

Re: [Lustre-discuss] stripe offset and hot-spots

2009-11-25 Thread Andreas Dilger
is 2, but some OSTs get 4 > objects and others get no objects then the application may see an > aggregate performance drop of 50% or more, if it were using random > object distribution. With round-robin distribution, every OST will > get 2 objects (assuming objects / OSTs is a whole number)

Re: [Lustre-discuss] question about failnode with mixed networks

2009-11-24 Thread Andreas Dilger
.org >>> http://lists.lustre.org/mailman/listinfo/lustre-discuss >> >> >> John White >> High Performance Computing Services (HPCS) >> (510) 486-7307 >> One Cyclotron Rd, MS: 50B-3209C >> Lawrence Berkeley National Lab >> Be

Re: [Lustre-discuss] stripe offset and hot-spots

2009-11-24 Thread Andreas Dilger
mple, if the average objects per OST is 2, but some OSTs get 4 objects and others get no objects then the application may see an aggregate performance drop of 50% or more, if it were using random object distribution. With round-robin distribution, every OST will get 2 objects (assuming
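The arithmetic behind that 50% figure can be reproduced with a toy model. The sketch below assumes every OST delivers the same bandwidth and that a job finishes only when the busiest OST finishes; the function names are made up for illustration and are not part of Lustre.

```python
import random


def round_robin(n_objects, n_osts):
    """Place objects on OSTs in strict rotation and return per-OST counts."""
    counts = [0] * n_osts
    for i in range(n_objects):
        counts[i % n_osts] += 1
    return counts


def random_placement(n_objects, n_osts, rng):
    """Place each object on a uniformly random OST and return per-OST counts."""
    counts = [0] * n_osts
    for _ in range(n_objects):
        counts[rng.randrange(n_osts)] += 1
    return counts


def relative_throughput(counts):
    """Mean load divided by the load on the busiest OST (1.0 means even)."""
    busiest = max(counts)
    if busiest == 0:
        return 1.0
    return (sum(counts) / len(counts)) / busiest


if __name__ == "__main__":
    print("round-robin:", relative_throughput(round_robin(8, 4)))
    print("4,4,0,0    :", relative_throughput([4, 4, 0, 0]))
    sample = random_placement(8, 4, random.Random())
    print("random     :", sample, relative_throughput(sample))
```

With an average of 2 objects per OST, a layout where some OSTs hold 4 objects and others hold none gives a ratio of 0.5 in this model, which is the 50% drop mentioned above; round-robin placement gives 1.0 whenever objects / OSTs is a whole number.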

Re: [Lustre-discuss] (no subject)

2009-11-21 Thread Andreas Dilger
rs to provide the storage, and Lustre will export it as a single filesystem to clients. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.or

Re: [Lustre-discuss] Using drbd: reformat disk or only sync ?

2009-11-21 Thread Andreas Dilger
RBD myself, but I believe that it should NOT require formatting a device before using DRBD on it. However, there would need to be an initial synchronization to copy all of the data from the primary copy to the backup. DRBD is just doing a block-level copy of one device to another, it

Re: [Lustre-discuss] Anyone built 1.8 or 1.6 on Fedora 12's 2.6.31 yet?

2009-11-20 Thread Andreas Dilger
at's already > been done. The 2.6.27 support (both client and server) should be in 1.8.1.1 AFAIK, because it runs on SLES11. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lu

Re: [Lustre-discuss] lustre setup with several subnets

2009-11-20 Thread Andreas Dilger
ed as a cifs gateway), as this is > only a You are far better off to bond the interfaces and present Lustre with a single bond0 interface. Less complexity, aggregate bandwidth for all connections, and redundancy at the network level. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer,

Re: [Lustre-discuss] Adaptive Timeouts in Lustre 1.6.x vs. 1.8.x

2009-11-20 Thread Andreas Dilger
but I'm not 100% sure that there weren't bug fixes from the version that was landed in 1.6 vs. what is in 1.8 today. Maybe someone else can comment? Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___

Re: [Lustre-discuss] mkfs.lustre stripe-count-hint

2009-11-17 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] Determine addresses of connected clients/servers

2009-11-17 Thread Andreas Dilger
how the routers, AFAIR. You can see this in /proc/fs/lustre/{obdfilter,mds}/{target}/exports to list the nids, or other per-client statistics. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-

Re: [Lustre-discuss] Size of MGT?

2009-11-17 Thread Andreas Dilger
AFAIR, ORNL has about 128MB of MGT data, but that is the largest filesystem in the world. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org

Re: [Lustre-discuss] question about failnode with mixed networks

2009-11-15 Thread Andreas Dilger
n all of the nodes to re-do the filesystem configuration. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/ma

Re: [Lustre-discuss] e2fsck --mdsdb segmentation fault

2009-11-13 Thread Andreas Dilger
id a real e2fsck (no "-n", but on an unmounted filesystem) then the journal will be replayed and no errors should be seen. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] e2fsck --mdsdb segmentation fault

2009-11-12 Thread Andreas Dilger
ing to do with Lustre, but is either bad disks/controllers/cables/cache that is corrupting your filesystem. You need to fix the underlying storage problem first. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. _

Re: [Lustre-discuss] osc lost on MDS server

2009-11-11 Thread Andreas Dilger
5. mount -t lustre mdtdevice mountpoint Where is it documented to delete all of the files in CONFIGS? This deletes the action of step #2 above, and isn't a good idea. Presumably there was also a step 4b to unmount the filesystem from type ldiskfs? Cheers, Andreas -- Andreas Dilger Sr. St

Re: [Lustre-discuss] Has anybody seen: ldiskfs_get_inode_block: bad inode number: 1

2009-11-09 Thread Andreas Dilger
nfidential and proprietary information. If you have received this > message in error, please notify us and remove it from your system > and note that you must not copy, distribute or take any action in > reliance on it. Any unauthorized use or disclosure of the contents > of this message i

Re: [Lustre-discuss] how to define 60 failnodes

2009-11-09 Thread Andreas Dilger
stem, rather than waiting for the client to time out its RPC and poke around trying to find which of the failover servers is controlling the OST/MDT. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lus

Re: [Lustre-discuss] Lustre patchless client status ?

2009-11-09 Thread Andreas Dilger
On 2009-11-09, at 06:41, Marc Mendez-Bermond wrote: > I wanted to check if the Lustre patchless client will be kept > developped and maintained for the next couple of years ? Sure, why wouldn't it be? We definitely don't want to go back to patched clients... Cheers, An

Re: [Lustre-discuss] 1.8.1.1, 2.6.27.29-0.1_lustre.1.8.1.1-default and HIGHMEM 64G

2009-11-07 Thread Andreas Dilger
customers using this kernel with 8GB or more of RAM. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@lists.lustre.org http://lists.lustre.org/mailman/listin

Re: [Lustre-discuss] lnet & client client access

2009-11-05 Thread Andreas Dilger
> experimental feature, > with 1.8.1 / 1.8.1.1 ? This is only an experimental feature in the HEAD code base, not in 1.8. At one time HEAD was going to be released as 1.8, but it is now going to be released as 2.0. Cheers, Andreas -- Andreas D

Re: [Lustre-discuss] how to move OST to different OSS

2009-11-04 Thread Andreas Dilger
SSes? If > so, how do I > do this without data loss? The Lustre version is 1.6.7. You should be able to, but you will need to use --writeconf. This is documented in the Lustre manual (it is the same as changing the IP address of an OSS). Cheers, Andreas -- Andreas Dilger Sr. Staff Engin

Re: [Lustre-discuss] OSS extremely slow in response, ll_ost load high

2009-11-03 Thread Andreas Dilger
in the running system? Of course I'm not sure they are the root of the > problem... Well, the ll_ost_* threads are the ones that are doing the actual work of handling the RPCs, so you can't get rid of them. The modprobe.conf oss_num_threads line is forcing the startup

Re: [Lustre-discuss] Core dumps on Lustre 1.6.5

2009-11-03 Thread Andreas Dilger
in this area at one time that was fixed, but I don't know the bug number or what version it was fixed in. The current stable 1.6 release is 1.6.7.2. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. __

Re: [Lustre-discuss] intrepid kernel on jaunty

2009-11-02 Thread Andreas Dilger
inux-headers-generic, rsync You mean the reverse of this should land, right? Lustre doesn't depend on automake1.7 specifically (it can work on any version up to 1.10). Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___

Re: [Lustre-discuss] Bad distribution of files among OSTs

2009-11-01 Thread Andreas Dilger
did not > find any other odd thing than the filling levels of > the OSTs. Yes, this is entirely possible. If the OST is very full, then it takes longer to find free blocks. > Andreas Dilger wrote: >> On 2009-10-30, at 12:07, Thomas Roth wrote: >>> in our 196 OST - Cluster,

Re: [Lustre-discuss] Support for vanilla kernels in lustre servers

2009-10-31 Thread Andreas Dilger
course these kernels also work for different distributions, > so instead > of going through the pain to port Lustre to Ubuntu or Debian > kernels, I simply > started to create debian packages for the RHEL5 kernels Yes, I also use different kernels on different distros (SLES10 kern

Re: [Lustre-discuss] Bad distribution of files among OSTs

2009-10-30 Thread Andreas Dilger
object IDs that are very large, and then in bug 21244 I've written a small program that dumps the MDS inode number from the specified objects. You can then use "debugfs -c -R 'ncheck {list of inode numbers}' /dev/${mdsdev}" on the MDS to find the pathnames of those f

Re: [Lustre-discuss] typical hardware...

2009-10-30 Thread Andreas Dilger
s from each OSS, and 64TB of usable space (after RAID-6, journaling, and hot spares are taken into account). > Also, the roadmap for Lustre used to include "raid" personalities > for OSS's. has that functionality been deferred or dropped > altogether, or is it still i

Re: [Lustre-discuss] [Lustre-Performance] Lustre performance on SLES 11 x86-64

2009-10-30 Thread Andreas Dilger
erformance. Secondly, bs=1k is also a good way to add a lot of overhead to the IO. At a minimum use bs=4k (to match the client page size). Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustr
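The overhead of a small block size is easy to quantify with simple arithmetic. The sketch below is illustrative only: the 1 GiB transfer size is an arbitrary example, the 4 KiB page size is the one named in the message, and the helper names are hypothetical.

```python
def write_calls(total_bytes, block_size):
    """Number of write() calls needed to move total_bytes in block_size chunks."""
    return -(-total_bytes // block_size)  # ceiling division on integers


def writes_per_page(block_size, page_size=4096):
    """How many separate writes touch one client page (at least one)."""
    return max(1, page_size // block_size)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for bs in (1024, 4096):
        print(f"bs={bs}: {write_calls(one_gib, bs)} calls, "
              f"{writes_per_page(bs)} write(s) per 4 KiB page")
```

For the same amount of data, bs=1k issues four times as many write calls as bs=4k, and each 4 KiB page is touched four times instead of once.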

Re: [Lustre-discuss] Small files performance

2009-10-30 Thread Andreas Dilger
ly also improve performance, since there is RAID parity overhead for writing small chunks of data to disk. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lustre-discuss mailing list Lustre-discuss@li

Re: [Lustre-discuss] Multihoned Problem, can mount o2ib but not tcp

2009-10-30 Thread Andreas Dilger
cience > Oak Ridge National Laboratory > (865) 241-6602 office > > ___ > Lustre-discuss mailing list > Lustre-discuss@lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss Cheers, Andreas -- Andreas Dilger Sr. S

Re: [Lustre-discuss] Support for vanilla kernels in lustre servers

2009-10-29 Thread Andreas Dilger
> over? It means that 2.6.22 is very old and spending our tester's time on this kernel is not very productive. The patches are still in CVS. We are working on adding in support for FC11 (2.6.30 AFAIR), which is fairly close to vanilla "stable" series, but it isn't finished

Re: [Lustre-discuss] vbox and ofed/IB

2009-10-26 Thread Andreas Dilger
> virtualized, or perhaps passed through to a VBox VM. > > Once that is taken care of, as far as Lustre is concerned, it should > be > business as usual. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] rsync --sparse actually work on RHEL5?

2009-10-25 Thread Andreas Dilger
n after-thought with getfattr. > > Tar requires the more clunky getfattr afterwards because this is RHEL5 > and RHEL5's tar doesn't appear to have EA support in it. We actually have a patched tar from FC11(?) tar-1.19 that has --xattr support in it. Available at the download

Re: [Lustre-discuss] 1.8.1 test setup achieved, what about maximum mdt size

2009-10-23 Thread Andreas Dilger
On 2009-10-23, at 03:51, Bernd Schubert wrote: > On Tuesday 20 October 2009, Andreas Dilger wrote: >> On 18-Oct-09, at 16:04, Piotr Wadas wrote: >>> Now, I did a simple count of MDT size as described in lustre 1.8.1 >>> manual, >>> and setup mdt as recommend

Re: [Lustre-discuss] more on MGS and MDT separation

2009-10-23 Thread Andreas Dilger
sn't need very much space. Maybe someone with access to ORNL Spider (largest Lustre filesystem) can comment on how much space is used in the /CONFIGS/ directory? Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. _

Re: [Lustre-discuss] A question on time out

2009-10-21 Thread Andreas Dilger
not know without some external HA software that the two disk devices on the SAN are shared. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] lustre patched client subdirectory mount

2009-10-20 Thread Andreas Dilger
client to achieve the same effect? Correct. We do not yet support sub-mounts, but as you say, it is possible on Linux to do a bind mount and then umount the original mountpoint. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun

Re: [Lustre-discuss] 1.8.1 test setup achieved, what about maximum mdt size

2009-10-20 Thread Andreas Dilger
tem. > And one more thing - I use combined MGS/MDT. What's actually about MGS > size? I mean, if I use separate MGS and MDT, what size it should have, > and how management service works, regarding to its block-device > storage ? The MGS needs only some MB of space, maybe 128MB i

Re: [Lustre-discuss] Understanding of MMP

2009-10-19 Thread Andreas Dilger
g the filesystem, then waiting 10s will not help. The HA software needs to power off (STONITH) the previous node before it starts a failover. Otherwise, there may be any number of blocks still in cache or in the IO elevator that might land on the disk after the takeover, if the "

Re: [Lustre-discuss] lustre servers and client with different rhel release Q?

2009-10-19 Thread Andreas Dilger
her that supports RHEL 5.1 or not. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] Slow read performance across OSSes

2009-10-17 Thread Andreas Dilger
Dual 3ware 9550SX cards with 7+1 RAID 5 across 400GB WD SATA >> drives. >> Two OST/OSS: 2TB. Configured as LVM. 1 and 4MB stripe size tried. >> Client: Dual quad-core 2.5 GHz Xeon, 8GB RAM single gigabit NIC >> Network: Dedicated Cisco 2960g Gigabit switch

Re: [Lustre-discuss] Problem re-mounting Lustre on an other node

2009-10-14 Thread Andreas Dilger
nks. Are you sure you are mounting the OSTs with type "lustre" instead of "ldiskfs"? I see the above Lustre messages on my system a few seconds after the LDISKFS messages are printed. If you are using MMP (which you should be, on an automated failover config) it will add 1

Re: [Lustre-discuss] Memory (?) problem with 1.8.1

2009-10-13 Thread Andreas Dilger
nically increasing "buffers" condition >> with non-Lustre I/O on systems running the Lustre 1.8.1 kernel >> (2.6.18-128.1.14.el5_lustre.1.8.1), > > Indeed. The using up of memory by the buffer cache is a standard > (i.e. > non-Lustre-specific) feature, and you will

Re: [Lustre-discuss] lfsck

2009-10-13 Thread Andreas Dilger
can see that lfsck change timestamps on db files > so maybe I don't have to rebuild them? I can't find anything about > that in the manual or mailing list. You need to re-create the databases from Lustre to verify the cleanup. Cheers, Andreas -- Andreas Dilger Sr. Staf

Re: [Lustre-discuss] Is there a way to set lru_size and have it stick?

2009-10-13 Thread Andreas Dilger
lprocfs was disabled until fixed. Until now, there was no reason to change that code, but it makes sense to fix that now... Could you file a bug on this? Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___

Re: [Lustre-discuss] Is there a way to set lru_size and have it stick?

2009-10-09 Thread Andreas Dilger
a reboot. "lctl set_param" is only for temporary tunable setting. You can use "lctl conf_param" to set a permanent tunable. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ___ Lust

Re: [Lustre-discuss] Lustre-discuss Digest, Vol 45, Issue 6

2009-10-07 Thread Andreas Dilger
to shut down your MDS, make sure your remote DRBD copy is up-to-date, then reformat the local storage into RAID-1+0, copy the remote DRBD mirror back to the local system, and then reformat the remote DRBD storage to RAID-1+0 also and copy it there. Cheers, Andreas -- Andreas Dilger Sr. Staff Engin

Re: [Lustre-discuss] Groups and Use of newgrp

2009-10-06 Thread Andreas Dilger
Is the MDS configured to have the supplementary groups upcall running? Does the MDS list the same /etc/group (or LDAP or whatever) as the clients? I believe this is in the manual. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Micro

Re: [Lustre-discuss] strange performance with POSIX file capabilities

2009-10-06 Thread Andreas Dilger
erformance of only some of the OSTs. I assume they are otherwise identical? The only thing I can imagine is that this option is related to SELinux and has some overhead in getting extended attributes, but even then the xattrs are only stored on the MDS so this would hurt all OSTs uniformly. Cheers

Re: [Lustre-discuss] Read/Write performance problem

2009-10-06 Thread Andreas Dilger
s contending with the reads to restart. As a general rule, avoiding unnecessary IO (i.e. reading back data that was just written) reduces the time that the application is not doing useful work (i.e. computing). Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of

Re: [Lustre-discuss] OST Pools

2009-10-05 Thread Andreas Dilger
"lfs setstripe". Also, any new files/directories created outside of "newdirectory" will use the default pool, which includes all OSTs. Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. __

Re: [Lustre-discuss] Deactivated OSTs and object placement

2009-10-03 Thread Andreas Dilger
ly easily by doing "ls -li {filename}" before and after your test, and if the inode number is different then your test has created a new file. Lustre today will unfortunately not restripe a file after it is created. In Lustre 2.1 or so we will start having the ability to restripe a file
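
The inode comparison described above can also be scripted. The following is only a minimal sketch: the helper names (inode_of, creates_new_file, append_in_place, rewrite_via_rename, demo) are hypothetical and do not come from the original message, and the two "test steps" are stand-ins for whatever the real test does to the file.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number, i.e. the first column of `ls -li path`."""
    return os.stat(path).st_ino


def creates_new_file(path, action):
    """Run action(path) and report whether path now refers to a different inode."""
    before = inode_of(path)
    action(path)
    return inode_of(path) != before


def append_in_place(path):
    # Hypothetical test step: modifies the existing file, so the inode is kept.
    with open(path, "a") as f:
        f.write("more data\n")


def rewrite_via_rename(path):
    # Hypothetical test step: writes a sibling file and renames it over the
    # original, so the name ends up pointing at a new inode.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("replacement\n")
    os.replace(tmp, path)


def demo():
    """Try both hypothetical steps on a scratch file and return the two results."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "testfile")
        with open(path, "w") as f:
            f.write("original\n")
        return (creates_new_file(path, append_in_place),
                creates_new_file(path, rewrite_via_rename))
```

A step that reports a changed inode has created a new file, which would be allocated objects according to the current striping rules; a step that keeps the inode has only modified the existing file and its existing layout.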

Re: [Lustre-discuss] Meaning or function of file in ldiskfs mounted MDS/OST

2009-09-30 Thread Andreas Dilger
> detail in this mailing list few days ago but no one reply us ) > > So, could any one please tell me exactly what the meaning or functions of > each file ( CATALOGS, CONFIGS) in it is ? You can just delete the CATALOGS file if it is corrupted; it will be recreated. Cheers, Andreas -

Re: [Lustre-discuss] multi-homed issues

2009-09-30 Thread Andreas Dilger
can automatically pick the right LNET network based on their subnet address. > Andreas Dilger wrote: >> On Sep 29, 2009 16:05 +1000, Philip Manuel wrote: >> >>> Hi we would like the lustre servers available to two networks, one on >>> eth0 (192.168.1.0/24) the

Re: [Lustre-discuss] iSCSI Integration Solutions

2009-09-29 Thread Andreas Dilger
opment I could probably help you with debugging this code. > It seems like this is possible, but I am missing how to bridge between > say a Lustre client, connecting to an exported Lustre FS and re-offering > that file system through iSCSI. This is something that probably already exists in

Re: [Lustre-discuss] Unable to write to filesystem (device full)

2009-09-29 Thread Andreas Dilger
thing like this. Even if you don't need it, then you have a backup :-). Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] multi-homed issues

2009-09-28 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.

Re: [Lustre-discuss] recover OST data just using ./ROOT/ and ea.bak?

2009-09-28 Thread Andreas Dilger
Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc.
