[Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hey,

After a test on my freshly installed test cluster with Lustre 1.6.7 I saw some errors in our logfiles.

I basically created plenty of files with:

    i=1; while true; do touch $i; echo $i > $i; i=$(($i+1)); done

and tried to delete them later with:

    lfs find . | xargs rm

Many files were deleted properly, but after a while lfs find reported:

---
[...]
warning: cb_find_init: ./3933 does not exist: No such file or directory (2)
warning: cb_find_init: ./2873 does not exist: No such file or directory (2)
warning: cb_find_init: ./4126 does not exist: No such file or directory (2)
[...]
---

At the same time, this showed up in dmesg on the mgs/mdt server:

---
LustreError: 2493:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 143 previous similar messages
LustreError: 2444:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2) r...@dcde5200 x21558/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237193648 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2444:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 2216 previous similar messages
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2) r...@df8c3600 x32012/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237193799 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 542 previous similar messages
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing error (-2) r...@d0c53000 x126145/t0 o34->50bef30a-7a07-d9da-81c5-8fb613d6b...@net_0x2c0a80103_uuid:0/0 lens 312/128 e 0 to 0 dl 1237194056 ref 1 fl Interpret:/0/0 rc -2/0
LustreError: 2386:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 3436 previous similar messages
---

Any hints as to what went wrong here, and why there were no errors when creating these files?
Greetings
Patrick

--
Patrick Winnertz
Tel.: +49 (0) 2161 / 4643 - 0

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
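For anyone trying to reproduce the report, the following is a minimal sketch of a bounded version of the test, not the original poster's script. The function names make_files and delete_listed and the LISTER variable are hypothetical. The delete step finishes the listing before it removes anything, which may help show whether the warnings depend on listing and deletion running at the same time; it is a diagnostic aid under that assumption, not a confirmed explanation of the errors.

```shell
#!/bin/sh
# Bounded variant of the create loop from the report:
# creates files 1..COUNT in DIR, each containing its own number.
make_files() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        touch "$dir/$i" || return 1
        echo "$i" > "$dir/$i" || return 1
        i=$((i + 1))
    done
}

# Two-phase delete: write the complete listing to a temporary file first,
# then remove the listed regular files.
# LISTER (hypothetical knob) is the listing command; it defaults to
# "lfs find" as in the report and can be set to plain "find" off Lustre.
delete_listed() {
    dir=$1
    list=$(mktemp) || return 1
    # Intentionally unquoted so that "lfs find" splits into two words.
    ( cd "$dir" && ${LISTER:-lfs find} . ) > "$list"
    ( cd "$dir" || exit 1
      while IFS= read -r path; do
          # Skip "." and any directories; remove regular files only.
          if [ -f "$path" ]; then rm -- "$path"; fi
      done < "$list" )
    rm -f "$list"
}
```

On a Lustre client this could be used as `make_files /mnt/lustre/test 10000` followed by `delete_listed /mnt/lustre/test`, where the mount point is only an example path. File names containing newlines are not handled, which is fine for the numeric names used here.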
Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hi Patrick,

On Mon, 2009-03-16 at 11:17 +0100, Patrick Winnertz wrote:
> Hey,
>
> After a test on my freshly installed test cluster with Lustre 1.6.7 I saw some
> errors in our logfiles.
>
> I basically created plenty of files with: i=1; while true; do touch $i;
> echo $i > $i; i=$(($i+1)); done
> and tried to delete them later with: lfs find . | xargs rm
> Many files were deleted properly, but after a while lfs find reported:
> ---
> [...]
> warning: cb_find_init: ./3933 does not exist: No such file or directory (2)
> warning: cb_find_init: ./2873 does not exist: No such file or directory (2)
> warning: cb_find_init: ./4126 does not exist: No such file or directory (2)
> [...]
> ---

Is this error reproducible? Can you reproduce it with the debug daemon started (lctl debug_daemon ) and with lnet.debug=-1 / lnet.subsystem_debug=-1 set?

I have had one similar report before, but no debug logs to investigate it.

Thanks.

--
Alex Lyashkov
Lustre Group, Sun Microsystems
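The capture requested above could be scripted roughly as follows. This is a sketch under assumptions: the log path in LOG, the helper names step, capture_start and capture_stop, and the RUN switch are all hypothetical and do not come from the thread. By default the script only prints the commands; with RUN=1 it executes them, which needs root on a node with the Lustre modules loaded. The same steps would be run on both the client and the mgs/mdt server so that both sides of the failing requests are logged.

```shell
#!/bin/sh
# Hypothetical output path for the binary debug log (not from the thread).
LOG=${LOG:-/tmp/lustre-debug.bin}

# Print the command by default; execute it only when RUN=1.
step() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

capture_start() {
    step sysctl -w lnet.debug=-1            # enable all debug message types
    step sysctl -w lnet.subsystem_debug=-1  # enable all subsystems
    step lctl debug_daemon start "$LOG"     # stream the debug buffer to LOG
}

capture_stop() {
    step lctl debug_daemon stop
    step lctl debug_file "$LOG" "$LOG.txt"  # convert the binary log to text
}
```

A typical sequence would be capture_start, then the failing `lfs find . | xargs rm` run, then capture_stop. Full debugging produces very large logs, so keeping the reproduction short is advisable.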
Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hey,

> Is this error reproducible? Can you reproduce it with the debug daemon
> started (lctl debug_daemon ) and with lnet.debug=-1 /
> lnet.subsystem_debug=-1 set?

As I was not sure where you wanted me to set this, I've uploaded two debug logs (one from the client and one from the mgs/mdt server):

http://www.credativ.com/~pwi/lustre-debug-client # from client
http://www.credativ.com/~pwi/lustre-debug-mgs # from server

Please wait a bit before downloading the client logfile, as it is quite large (~250MB); the server logfile is complete.

> I have had one similar report before, but no debug logs to
> investigate it.

I hope this helps to sort this out. If you need more information, please ask.

Greetings
Patrick
Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hi,

On Mon, 2009-03-16 at 14:19 +0100, Patrick Winnertz wrote:
> Hey,
>
> > Is this error reproducible? Can you reproduce it with the debug daemon
> > started (lctl debug_daemon ) and with lnet.debug=-1 /
> > lnet.subsystem_debug=-1 set?
> As I was not sure where you wanted me to set this, I've uploaded two debug logs
> (one from the client and one from the mgs/mdt server):
>
> http://www.credativ.com/~pwi/lustre-debug-client # from client
> http://www.credativ.com/~pwi/lustre-debug-mgs # from server
>
> Please wait a bit before downloading the client logfile, as it is quite large
> (~250MB); the server logfile is complete.

It looks like something is wrong with the permissions:

$ wget http://www.credativ.com/~pwi/lustre-debug-client
--2009-03-17 07:51:24-- http://www.credativ.com/~pwi/lustre-debug-client
Resolving www.credativ.com... 88.198.32.163
Connecting to www.credativ.com|88.198.32.163|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.credativ.com/404.html [following]

--
Alex Lyashkov
Lustre Group, Sun Microsystems
Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hey,

> It looks like something is wrong with the permissions:
> $ wget http://www.credativ.com/~pwi/lustre-debug-client
> --2009-03-17 07:51:24-- http://www.credativ.com/~pwi/lustre-debug-client
> Resolving www.credativ.com... 88.198.32.163
> Connecting to www.credativ.com|88.198.32.163|:80... connected.
> HTTP request sent, awaiting response... 302 Found
> Location: http://www.credativ.com/404.html [following]

Sorry about this. It is fixed now.

Greetings
Patrick
Re: [Lustre-discuss] LustreErrors on mgs/mdt when accessing files
Hi Patrick,

On Tue, 2009-03-17 at 08:51 +0100, Patrick Winnertz wrote:
> Hey,
> > It looks like something is wrong with the permissions:
> > $ wget http://www.credativ.com/~pwi/lustre-debug-client
> > --2009-03-17 07:51:24-- http://www.credativ.com/~pwi/lustre-debug-client
> > Resolving www.credativ.com... 88.198.32.163
> > Connecting to www.credativ.com|88.198.32.163|:80... connected.
> > HTTP request sent, awaiting response... 302 Found
> > Location: http://www.credativ.com/404.html [following]
> Sorry about this. It is fixed now.
>

The logs are downloading now; I will look at them later today or tomorrow.

--
Alex Lyashkov
Lustre Group, Sun Microsystems