[Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-16 Thread Patrick Winnertz
Hey,

After a test on my freshly installed test cluster with Lustre 1.6.7, I saw some
errors in our log files.

I basically created plenty of files with:
  i=1; while true; do touch $i; echo $i > $i; i=$(($i+1)); done
and tried to delete them later with:
  lfs find . | xargs rm
Many files were deleted properly, but after a while lfs find reported:
---
[...]
warning: cb_find_init: ./3933 does not exist: No such file or directory (2)
warning: cb_find_init: ./2873 does not exist: No such file or directory (2)
warning: cb_find_init: ./4126 does not exist: No such file or directory (2)
[...]
---
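The reproduction described above can be sketched as a bounded script. This is a minimal sketch under assumptions, not the exact commands that were run: COUNT, WORKDIR and LISTER are hypothetical names introduced here, whereas the original loop was unbounded and ran in the current directory. LISTER falls back to plain find so the sketch runs anywhere; on a Lustre client it would be set to "lfs find". Unlike the original pipeline, the sketch finishes the listing before it removes anything and skips directories, which may help to tell a listing problem from a removal problem.

```shell
#!/bin/sh
# Hypothetical parameters, not taken from the original report.
COUNT=${COUNT:-1000}                # number of files to create
WORKDIR=${WORKDIR:-$(mktemp -d)}    # scratch directory; point it at a Lustre mount to test Lustre
LISTER=${LISTER:-find}              # set LISTER="lfs find" on a Lustre client

cd "$WORKDIR" || exit 1

# Create COUNT small files, each containing its own name.
i=1
while [ "$i" -le "$COUNT" ]; do
    echo "$i" > "$i"
    i=$((i + 1))
done
created=$(ls | wc -l | tr -d ' ')

# Take the complete listing first, so the walk and the removal do not overlap.
# LISTER is left unquoted on purpose so that "lfs find" splits into two words.
listing=$(mktemp)
$LISTER . > "$listing"

# Remove regular files only; "." and any directories in the listing are skipped.
while IFS= read -r path; do
    if [ -f "$path" ]; then
        rm -- "$path"
    fi
done < "$listing"
rm -f "$listing"

remaining=$(find . -type f | wc -l | tr -d ' ')
echo "created=$created remaining=$remaining"
```

For example, saved under the hypothetical name repro.sh, it could be run as: LISTER="lfs find" WORKDIR=/path/on/lustre sh repro.sh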
At the same time, this showed up in dmesg on the MGS/MDT server:
---
LustreError: 2493:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 143 previous similar messages
LustreError: 2444:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2)  r...@dcde5200 x21558/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237193648 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2444:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 2216 previous similar messages
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2)  r...@df8c3600 x32012/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237193799 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 542 previous similar messages
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2)  r...@d0c53000 x126145/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237194056 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 3436 previous similar messages
---
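As a reading aid for the server messages above, the request opcode, the return code and the suppressed-message counters can be pulled out with a small filter. This is only a sketch: the function name summarize_lustre_errors is made up here, and it assumes each message sits on a single line (mail clients tend to wrap them). It relies only on the visible pattern of the lines ("o<number>->", "rc <number>/", "Skipped <number> previous similar messages"), not on any documented log format.

```shell
#!/bin/sh
# Read dmesg-style LustreError lines on stdin. Print one "opcode=<n> rc=<n>"
# line per request message, then the sum of all "Skipped N" counters.
summarize_lustre_errors() {
    awk '
        {
            op = ""; rc = ""
            # " o34->" : skip the leading " o", drop the trailing "->"
            if (match($0, / o[0-9]+->/))
                op = substr($0, RSTART + 2, RLENGTH - 4)
            # " rc -2/" : skip the leading " rc ", drop the trailing "/"
            if (match($0, / rc -?[0-9]+\//))
                rc = substr($0, RSTART + 4, RLENGTH - 5)
            if (op != "" && rc != "")
                print "opcode=" op " rc=" rc
            # "Skipped 143 " : keep only the number
            if (match($0, /Skipped [0-9]+ /))
                skipped += substr($0, RSTART + 8, RLENGTH - 9)
        }
        END { print "skipped=" (skipped + 0) }
    '
}
```

It could be used as: dmesg | grep LustreError | summarize_lustre_errors. A return code of -2 corresponds to errno 2, the same "No such file or directory (2)" that the client-side warnings show.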

Any hints as to what went wrong here, and why there were no errors when
creating these files?

Greetings
Patrick
-- 
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-16 Thread Alex Lyashkov
Hi Patrick,

On Mon, 2009-03-16 at 11:17 +0100, Patrick Winnertz wrote:
> Hey,
> 
> After a test on my freshly installed testcluster with lustre 1.6.7 I saw some 
> errors in our logfiles. 
> 
> I've basically created plenty of files with:
>   i=1; while true; do touch $i; echo $i > $i; i=$(($i+1)); done
> and tried to delete them later with:  lfs find . | xargs rm
> Many files are deleted properly, but after a while lfs find stated:
> ---
> [...]
> warning: cb_find_init: ./3933 does not exist: No such file or directory (2)
> warning: cb_find_init: ./2873 does not exist: No such file or directory (2)
> warning: cb_find_init: ./4126 does not exist: No such file or directory (2)
> [...]
> ---
Is this error reproducible? Can you reproduce it with the debug daemon
running (lctl debug_daemon) and with lnet.debug=-1 and
lnet.subsystem_debug=-1 set?
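The capture requested above can be sketched as follows. This is a hedged sketch, not an official procedure: the file paths, the size cap and the names run and capture_debug are assumptions introduced here, and the thread does not settle on which node the settings were meant to be applied. By default the script only prints the commands; with DRY_RUN=0 it executes them, which needs root and a node with Lustre loaded.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
DEBUG_BIN=${DEBUG_BIN:-/tmp/lustre-debug.bin}   # hypothetical path for the binary dump
DEBUG_TXT=${DEBUG_TXT:-/tmp/lustre-debug.txt}   # hypothetical path for the decoded text
DEBUG_MB=${DEBUG_MB:-1024}                      # hypothetical size cap for the dump, in MB

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: shell command line that reproduces the problem
capture_debug() {
    run sysctl -w lnet.debug=-1                # enable all debug message types
    run sysctl -w lnet.subsystem_debug=-1      # enable all subsystems
    run lctl debug_daemon start "$DEBUG_BIN" "$DEBUG_MB"
    run sh -c "$1"                             # reproduce while the daemon records
    run lctl debug_daemon stop
    run lctl debug_file "$DEBUG_BIN" "$DEBUG_TXT"   # convert the binary dump to text
}
```

For the case in this thread, the call would be: capture_debug 'lfs find . | xargs rm'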

I have had one similar report before, but there were no debug logs to
investigate it.

Thanks.

-- 
Alex Lyashkov 
Lustre Group, Sun Microsystems



Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-16 Thread Patrick Winnertz
Hey,

> Is this error replicated? can you replicate this with start debug daemon
> (lctl debug_daemon ) and set lnet.debug=-1 /
> lnet.subsystem_debug=-1  ?
As I was not sure where you wanted me to set this, I've uploaded two debug
logs (one from the client and one from the MGS/MDT server).

http://www.credativ.com/~pwi/lustre-debug-client # from client
http://www.credativ.com/~pwi/lustre-debug-mgs # from server

Please wait a bit before downloading the client log file, as it is quite
large (~250MB); the server log file is already complete.

> I have one similar report before - but not have debug logs for
> investigate.
I hope this helps to sort this out.
If you need more information, please ask.

Greetings
Patrick


Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-16 Thread Alex Lyashkov
Hi

On Mon, 2009-03-16 at 14:19 +0100, Patrick Winnertz wrote:
> Hey,
> 
> > Is this error replicated? can you replicate this with start debug daemon
> > (lctl debug_daemon ) and set lnet.debug=-1 /
> > lnet.subsystem_debug=-1  ?
> As I was not sure where you want me to set this I've uploaded two debug logs 
> (one from the client and one from the mgs/mdt server).
> 
> http://www.credativ.com/~pwi/lustre-debug-client # from client
> http://www.credativ.com/~pwi/lustre-debug-mgs # from server
> 
> Please wait a bit for downloading the client logfile it's quite huge (~250MB) 
> the server logfile is complete.

It looks like something is wrong with the permissions:
$ wget http://www.credativ.com/~pwi/lustre-debug-client
--2009-03-17 07:51:24--  http://www.credativ.com/~pwi/lustre-debug-client
Resolving www.credativ.com... 88.198.32.163
Connecting to www.credativ.com|88.198.32.163|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.credativ.com/404.html [following]


-- 
Alex Lyashkov 
Lustre Group, Sun Microsystems



Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-17 Thread Patrick Winnertz
Hey,
> looks something wrong with permission:
> $ wget http://www.credativ.com/~pwi/lustre-debug-client
> --2009-03-17 07:51:24--
> http://www.credativ.com/~pwi/lustre-debug-client
> Resolving www.credativ.com... 88.198.32.163
> Connecting to www.credativ.com|88.198.32.163|:80... connected.
> HTTP request sent, awaiting response... 302 Found
> Location: http://www.credativ.com/404.html [following]
Sorry about that; this is fixed now.

Greetings
Patrick


Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files

2009-03-17 Thread Alex Lyashkov
Hi Patrick,

On Tue, 2009-03-17 at 08:51 +0100, Patrick Winnertz wrote:
> Hey,
> > looks something wrong with permission:
> > $ wget http://www.credativ.com/~pwi/lustre-debug-client
> > --2009-03-17 07:51:24--
> > http://www.credativ.com/~pwi/lustre-debug-client
> > Resolving www.credativ.com... 88.198.32.163
> > Connecting to www.credativ.com|88.198.32.163|:80... connected.
> > HTTP request sent, awaiting response... 302 Found
> > Location: http://www.credativ.com/404.html [following]
> Sorry for this, This is fixed now.
> 
The logs are downloading now; I will look at them later today or tomorrow.

-- 
Alex Lyashkov 
Lustre Group, Sun Microsystems
