On Fri, 2010-04-16 at 10:49 -0700, Andreas Dilger wrote:
On 2010-04-16, at 01:27, Christos Theodosiou wrote:
Our Lustre installation uses two failover MDSes, which serve 10
filesystems. We recently upgraded from version 1.8.1.1 to 1.8.2.
By monitoring the MDSes I noticed that we get
Dear all,
We have run into the same situation as the problem discussed here:
http://lists.lustre.org/pipermail/lustre-discuss/2009-January/009512.html
One OST is set read-only the first time it is remounted after a server
crash.
Apr 16 17:40:31 boss27 kernel: LDISKFS-fs
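When an ldiskfs error has remounted an OST read-only, the usual recovery is
to unmount that OST and fsck its backing device before bringing it back. A
minimal sketch, with the mount point and device path assumed:

    # unmount the affected OST (mount point assumed)
    umount /mnt/ost0
    # check the backing ldiskfs device with the Lustre e2fsprogs
    # (-f forces a full check, -p fixes safe problems automatically)
    e2fsck -fp /dev/sdb1
    # remount the OST once the check comes back clean
    mount -t lustre /dev/sdb1 /mnt/ost0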
On Apr 18, 2010, at 11:46 AM, Bernd Schubert wrote:
You don't need to take the filesystem offline for lfsck.
You sure about that? Looking at
http://wiki.lustre.org/manual/LustreManual18_HTML/LustreRecovery.html#50598012_37365
step 1 says "Stop the Lustre File System."
Also, I have
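For reference, the lfsck flow in that manual page builds its databases with
the Lustre-patched e2fsprogs, and only the final pass runs from a mounted
client. A sketch under those assumptions (device paths and database file
locations are placeholders):

    # 1. build the MDS database from the MDT device (read-only pass)
    e2fsck -n -v --mdsdb /tmp/mdsdb /dev/mdtdev
    # 2. build a database on each OST, feeding in the MDS database
    e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /dev/ostdev
    # 3. run lfsck from a client with the filesystem mounted
    lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /mnt/lustre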
On Apr 18, 2010, at 1:14 PM, Andreas Dilger wrote:
On 2010-04-18, at 07:16, Charles Taylor wrote:
On Apr 18, 2010, at 9:35 AM, Miguel Afonso Oliveira wrote:
You are going to have to use unlink with something like this:
    for file in $(cat lost_files); do
        unlink "$file"
    done
Nope. That's really no
Hi Guys,
My users are reporting some issues with memory on our lustre 1.8.1 clients.
It looks like when they submit a single job at a time, the run time is about
4.5 minutes. However, when they run multiple jobs (10 or fewer) on a client
with 192GB of memory on a single node, the run time for each
There is a known problem with the DLM LRU size that may be affecting
you. It may be something else too. Please check /proc/
{slabinfo,meminfo} to see what is using the memory on the client.
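A sketch of what to look at, assuming the 1.8 proc layout (the lru_size
value below is only an example):

    # per-namespace DLM lock counts and LRU size on the client
    lctl get_param ldlm.namespaces.*.lock_count
    lctl get_param ldlm.namespaces.*.lru_size
    # cap the LRU at a fixed size instead of leaving it dynamic
    lctl set_param ldlm.namespaces.*.lru_size=1200
    # rough per-slab memory use: slab name, active_objs * objsize in bytes
    awk 'NR > 2 { print $1, $2 * $4 }' /proc/slabinfo | sort -rn -k2 | head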
Cheers, Andreas
On 2010-04-19, at 10:43, Jagga Soorma jagg...@gmail.com wrote:
Hi Guys,
My users
Hi,
we saw this LBUG 3 times within the past week, and are puzzled about what's
going on and how come there's no bugzilla entry for this...
What happens is that on an OSS a request (must be read or write) expects
(according to the content of the ioobj structure) to find an array of 22 struct
Thanks for the response Andreas.
What is the known problem with the DLM LRU size? Here is what my
slabinfo/meminfo look like on one of the clients. I don't see anything out
of the ordinary:
(then again there are no jobs currently running on this system)
Thanks
-J
--
slabinfo:
..
I was also going to recommend the unlink.
We have had to do this as well; the unlink worked for us. It did need to be
run with privileges for the file (root in our case).
--
Andrew
-----Original Message-----
From: lustre-discuss-boun...@lists.lustre.org
Hi all,
I'm running a small Lustre system (1.8.1.1): 1 MDS, 1 OSS, 2 clients.
Each node has 1GigE and InfiniBand (mlx4_0) with IPoIB set up. I'm
trying to use the IB transport.
The /etc/modprobe.conf is the same for all nodes:
--
alias eth0 e1000e
alias eth1 e1000e
alias eth2 8139too
alias
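For the o2ib transport, the line that usually matters in modprobe.conf is
the LNet networks option. A minimal sketch, with interface names assumed:

    # route Lustre over the IB interface, keeping tcp on eth0 as a fallback
    options lnet networks=o2ib0(ib0),tcp0(eth0)

After reloading the modules, lctl list_nids on each node should report an
@o2ib0 NID alongside any tcp one.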
Hello Erich,
check out my bug report:
https://bugzilla.lustre.org/show_bug.cgi?id=19992
It was closed as a duplicate of bug 16129, although that is probably not
correct, as 16129 is the root cause but not the solution.
As we never observed it with 1.6.7.2, I didn't complain that bug 19992 was
On 2010-04-19, at 01:41, x...@xgl.pereslavl.ru wrote:
I have 1 OST that appears as an inactive device on the client:
[Client] lfs df -h
UUID                 bytes    Used     Available  Use%  Mounted on
lustre00-MDT_UUID    814.8G   471.8M   767.8G     0%    /mnt/lustre00[MDT:0]
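A common way to check and recover this from the client is via lctl; a
sketch, with the device number assumed from your own lctl dl output:

    # list configured devices and their state (an inactive one shows IN)
    lctl dl
    # reactivate the OSC that fronts the inactive OST (number assumed)
    lctl --device 7 activate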
Christos Theodosiou wrote:
On Fri, 2010-04-16 at 10:49 -0700, Andreas Dilger wrote:
On 2010-04-16, at 01:27, Christos Theodosiou wrote:
Our Lustre installation uses two failover MDSes, which serve 10
filesystems. We recently upgraded from version 1.8.1.1 to 1.8.2.
By monitoring the
On 2010-04-19, at 11:16, Jagga Soorma wrote:
What is the known problem with the DLM LRU size?
It is mostly a problem on the server, actually.
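A sketch of a quick server-side look, assuming the 1.8 naming where OSS
lock namespaces carry a filter- prefix:

    # DLM lock counts held on behalf of clients on the OSS
    lctl get_param ldlm.namespaces.filter-*.lock_count
    lctl get_param ldlm.namespaces.filter-*.lru_size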
Here is what my slabinfo/meminfo look like on one of the clients.
I don't see anything out of the ordinary:
(then again there are no jobs