date:20100419

Re: [Lustre-discuss] Frequent appearence of LustreError: no handle for file close ino

2010-04-19 Thread Christos Theodosiou

On Fri, 2010-04-16 at 10:49 -0700, Andreas Dilger wrote: On 2010-04-16, at 01:27, Christos Theodosiou wrote: our lustre installation uses two failover MDSes, which serve 10 file-systems. We recently upgraded from 1.8.1.1 to 1.8.2 version. By monitoring the MDSes I noticed that we get

[Lustre-discuss] On-disk bitmap corrupted

2010-04-19 Thread Lu Wang

Dear all, We are in the same situation as the problem discussed here: http://lists.lustre.org/pipermail/lustre-discuss/2009-January/009512.html One OST is set read-only the first time it is remounted after a server crash. Apr 16 17:40:31 boss27 kernel: LDISKFS-fs

Re: [Lustre-discuss] Lost Files - How to remove from MDT

2010-04-19 Thread Charles Taylor

On Apr 18, 2010, at 11:46 AM, Bernd Schubert wrote: You don't need to take the filesystem offline for lfsck. You sure about that? Looking at http://wiki.lustre.org/manual/LustreManual18_HTML/LustreRecovery.html#50598012_37365 step 1 says Stop the Lustre File System. Also, I have

Re: [Lustre-discuss] Lost Files - How to remove from MDT

2010-04-19 Thread Charles Taylor

On Apr 18, 2010, at 1:14 PM, Andreas Dilger wrote: On 2010-04-18, at 07:16, Charles Taylor wrote: On Apr 18, 2010, at 9:35 AM, Miguel Afonso Oliveira wrote: You are going to have to use unlink with something like this: for file in lost_files unlink $file Nope. That's really no

[Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Jagga Soorma

Hi Guys, My users are reporting some issues with memory on our Lustre 1.8.1 clients. When they submit a single job at a time, the run time is about 4.5 minutes. However, when they run multiple jobs (10 or fewer) on a single client node with 192GB of memory, the run time for each

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Andreas Dilger

There is a known problem with the DLM LRU size that may be affecting you. It may be something else too. Please check /proc/{slabinfo,meminfo} to see what is using the memory on the client. Cheers, Andreas On 2010-04-19, at 10:43, Jagga Soorma jagg...@gmail.com wrote: Hi Guys, My users

[Lustre-discuss] LBUG: ost_rw_hpreq_check() ASSERTION(nb != NULL) failed

2010-04-19 Thread Erich Focht

Hi, we saw this LBUG 3 times within the past week, and are puzzled about what's going on and how come there's no Bugzilla entry for this... What happens is that on an OSS a request (must be read or write) expects (according to the content of the ioobj structure) to find an array of 22 struct

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Jagga Soorma

Thanks for the response Andreas. What is the known problem with the DLM LRU size? Here is what my slabinfo/meminfo look like on one of the clients. I don't see anything out of the ordinary: (then again there are no jobs currently running on this system) Thanks -J -- slabinfo: .. slabinfo -

Re: [Lustre-discuss] Lost Files - How to remove from MDT

2010-04-19 Thread Lundgren, Andrew

I was also going to recommend the unlink. We have had to do this as well, the unlink worked for us. It did need to be run with privileges for the file. (root in our case.) -- Andrew -Original Message- From: lustre-discuss-boun...@lists.lustre.org

[Lustre-discuss] Lustre MDS unable to start

2010-04-19 Thread neutron

Hi all, I'm running a small Lustre system (1.8.1.1): 1 MDS, 1 OSS, 2 clients. Each node has 1gig and InfiniBand (mlx4_0) with IPoIB set up. I'm trying to use the IB transport. The /etc/modprobe.conf is the same for all nodes: -- alias eth0 e1000e alias eth1 e1000e alias eth2 8139too alias

Re: [Lustre-discuss] LBUG: ost_rw_hpreq_check() ASSERTION(nb != NULL) failed

2010-04-19 Thread Bernd Schubert

Hello Erich, check out my bug report: https://bugzilla.lustre.org/show_bug.cgi?id=19992 It was closed as a duplicate of bug 16129, although that is probably not correct, as 16129 is the root cause, but not the solution. As we never observed it with 1.6.7.2 I didn't complain that bug 19992 was

Re: [Lustre-discuss] Inactive OST

2010-04-19 Thread Andreas Dilger

On 2010-04-19, at 01:41, x...@xgl.pereslavl.ru wrote: I have 1 OST that appears as an inactive device on the client: [Client] lfs df -h UUID bytes Used Available Use% Mounted on lustre00-MDT_UUID 814.8G 471.8M 767.8G 0% /mnt/lustre00[MDT:0]

Re: [Lustre-discuss] Frequent appearence of LustreError: no handle for file close ino

2010-04-19 Thread Dmitry Zogin

Christos Theodosiou wrote: On Fri, 2010-04-16 at 10:49 -0700, Andreas Dilger wrote: On 2010-04-16, at 01:27, Christos Theodosiou wrote: our lustre installation uses two failover MDSes, which serve 10 file-systems. We recently upgraded from 1.8.1.1 to 1.8.2 version. By monitoring the

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Andreas Dilger

On 2010-04-19, at 11:16, Jagga Soorma wrote: What is the known problem with the DLM LRU size? It is mostly a problem on the server, actually. Here is what my slabinfo/meminfo look like on one of the clients. I don't see anything out of the ordinary: (then again there are no jobs

14 matches
