Re: [Lustre-discuss] clients gets EINTR from time to time
Dear list,

I am still investigating this issue and am now struggling with debugging.

The issue arose once more yesterday, so I looked at it more deeply and decided that the trace debug should be written to disk using debug_daemon. Alas, with only the trace debug flag active, the clients produce more than 100 MB/s of log (yes, these are busy clients). I have tried several strategies, such as running debug_kernel from a cron job or while watching my product's error log, but even then dk dumps 70 MB of data representing less than one second of debug log. So my chances of tracing the signal seem low.

Is there a less verbose debug flag that would still include the signal I am looking for?

Given John's answer, I could maybe use /proc/sys/lustre/dump_on_timeout to dump the log only when a timeout happens, but this will work only if my problem matches what John can reproduce.

Please also note that I have looked for abnormal threads_started numbers: everywhere it has the same value as threads_min, except for one mdt entry which is at threads_min+1.

Regards,

François Chassaing
Technical Director - CTO

----- Original Message -----
From: John Hammond jhamm...@tacc.utexas.edu
To: Andreas Dilger adil...@whamcloud.com
Cc: lustre-discuss@lists.lustre.org
Sent: Friday, 25 February 2011 21:16:36 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

On 02/25/2011 11:39 AM, Andreas Dilger wrote:
> On 2011-02-25, at 6:28, Brian J. Murrell br...@whamcloud.com wrote:
>> On 11-02-25 06:18 AM, Francois wrote:
>>> I will continue to parse the debug logs and keep you posted.
>>
>> I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.
>
> In all fairness Brian, it isn't always possible to fix an application like you suggest. It might be commercial (binary only), it might be complex code using 3rd party libraries to do the IO that would lose support if modified, etc.
>
> I think the first action to debug this is to run on the client with "lctl set_param debug=+trace" or "=~0", which will enable function entry/exit tracing in Lustre. Then when the problem is hit, run "lctl dk /tmp/debug" to dump the Lustre debug log, and search for "-4" (which is -EINTR) to see where this error first appears. At that point we can make a determination where the source of the error is, and if it is Lustre's fault.
>
> I know at one time there was a related problem in the l_wait_event() macro that was improperly masking signals, but I thought it was fixed by 1.8.5.

Setting aside the moral question of which calls should be interruptible, I think that the handling of the LUSTRE_FATAL_SIGS (defined in lustre_lib.h to be SIGKILL, SIGINT, SIGTERM, SIGQUIT, SIGALRM) is slightly broken. Under certain situations, Lustre will return -EINTR although no signals were delivered. That's probably not the end of the world for most applications, but OTOH I don't think anybody assumes that -EINTR will be delivered spuriously.

Consider the following sequence:

1) Process P has a Lustre file F open.
2) P has SIGALRM pending (but blocked).
3) P starts writing to F and ends up sleeping in (something like): sys_write() ... ll_extent_lock() ... osc_enqueue() ... ptlrpc_queue_wait().
4) The OST does not respond to the request before the deadline, so l_wait_event() replaces the signal mask of P with the LUSTRE_FATAL_SIGS, notices that SIGALRM is now deliverable, restores the signal mask of P, and ptlrpc_queue_wait() returns -EINTR.
5) P is exiting from sys_write(), SIGALRM is blocked (but still pending) so it doesn't get delivered.
6) P spuriously returns -EINTR from sys_write().

I can reproduce this on 1.8.5/RHEL 5.5.

If the goal is to emulate NFS's interruptibility during congestion then returning -ERESTARTSYS would be more appropriate. Also, it might be worthwhile to make this extra interruptibility a mount flag, as NFS does.

Best,
John

-- 
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
jhamm...@tacc.utexas.edu

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] clients gets EINTR from time to time
Thanks, but anyway, the logs on the MDS/MGS do not show an evicted client of any kind. Also, the log output by lctl debug_kernel on the clients does not show much: I can only see the last administrative actions I have taken (such as setting the striping policy on a directory, creating a new server pool, ...) and four unrelated (because not happening at my problem hours) "Dropping PUT from" messages.

I will continue to parse the debug logs and keep you posted.

Thanks

François Chassaing
Technical Director - CTO

----- Original Message -----
From: Kevin Van Maren kevin.van.ma...@oracle.com
To: DEGREMONT Aurelien aurelien.degrem...@cea.fr
Cc: Francois Chassaing f...@weborama.com, lustre-discuss@lists.lustre.org
Sent: Thursday, 24 February 2011 18:43:25 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

No, in case of an eviction or I/O errors, EIO is returned to the application, not EINTR.

Kevin

DEGREMONT Aurelien wrote:
> Hello
>
> From my understanding, Lustre can return EINTR for some I/O error cases. I think that when a client gets evicted in the middle of one of its RPCs, it can return EINTR to the caller. Could this explain your issue? Can you verify that your clients were not evicted at the same time?
>
> Aurélien
>
> Francois Chassaing wrote:
>> OK, thanks, that makes it clearer. I had indeed mixed up (in my mind and in my words) signals and error return codes. I did understand that the write()/pwrite() system call was returning the EINTR error code because it received a signal, but I supposed that the signal was sent because of an error condition somewhere in the FS. This is where I now think I was wrong.
>>
>> As for your questions:
>> - I have to mention that I have always had this issue, and this is why I upgraded from 1.8.4 to 1.8.5, hoping this would solve it.
>> - I will try to have the SA_RESTART flag set in the app... if I can find where the signal handler is set.
>> - How can I see whether Lustre is returning EINTR for any other reason? As I said, no log shows anything on either the MDS or the OSSs, but I have not examined lctl debug_kernel yet... which I am going to do right away.
>>
>> My last question is: how can I tell which signal I am receiving? My app doesn't say; it just prints the write/pwrite error code. If there is no signal handler, then it should follow the standard actions (as per man 7 signal). On the other hand, my app does not stop or dump core, and the signal is not ignored, so it has to be handled in the code. Correct me if I'm wrong...
>>
>> At this point you will have realized that I didn't write the app, nor am I a good Linux guru ;-)
>>
>> Thanks a lot.
>>
>> François Chassaing
>> Technical Director - CTO
>>
>> ----- Original Message -----
>> From: Ken Hornstein k...@cmf.nrl.navy.mil
>> To: Francois Chassaing f...@weborama.com
>> Cc: lustre-discuss@lists.lustre.org
>> Sent: Thursday, 24 February 2011 15:54:24 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
>> Subject: Re: [Lustre-discuss] clients gets EINTR from time to time
>>
>>> OK, the app is used to deal with standard disks, that is why it is not handling the EINTR signal properly.
>>
>> I think you're misunderstanding what a signal is in the Unix sense. EINTR isn't a signal; it's a return code from the write() system call that says, "Hey, you got a signal in the middle of this write() call and it didn't complete". It doesn't mean that there was an error writing the file; if that was happening, you'd get a (presumably different) error code. Signals can be sent by the operating system, but those signals are things like SIGSEGV, which basically means, "your program screwed up". Programs can also send signals to each other, with kill(2) and the like.
>>
>> Now, NORMALLY system calls like write() are interrupted by signals when you're writing to "slow" devices, like network sockets. According to the signal(7) man page, disks are not normally considered slow devices, so I can understand the application not being used to handling this. And you know, now that I think about it I'm not even sure that network filesystems SHOULD allow I/O system calls to be interrupted by signals ... I'd have to think more about it.
>>
>> I suspect what happened is that something changed between 1.8.5 and the previous version of Lustre that you were using that allowed some operations to be interruptible by signals.
>>
>> Some things to try:
>>
>> - Check to see if you are, in fact, receiving a signal in your application and Lustre isn't returning EINTR for some other reason.
>> - If you are receiving a signal, when you set the signal handler for it you could use the SA_RESTART flag to restart the interrupted I/O; I think that would make everything work like it did before.
>>
>> --Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
On 11-02-25 06:18 AM, Francois Chassaing wrote:
> Thanks, but anyway, the logs on the MDS/MGS do not show an evicted client of any kind. Also, the log output by lctl debug_kernel on the clients does not show much: I can only see the last administrative actions I have taken (such as setting the striping policy on a directory, creating a new server pool, ...) and four unrelated (because not happening at my problem hours) "Dropping PUT from" messages.
>
> I will continue to parse the debug logs and keep you posted.

I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
Re: [Lustre-discuss] clients gets EINTR from time to time
Maybe, once you explain to me why it is a perfectly expected condition, because that I still don't get. How can it be expected that a file cannot be written because of an interrupted system call, when all conditions are apparently met to write successfully to this file: a single client writing to a single file, no locking or concurrent access from other clients, no client eviction shown, no hardware failure, nothing?

Please remember that this application traditionally deals with standard disks, which DO NOT return EINTR except on error conditions...

Regards,

François Chassaing
Technical Director - CTO

----- Original Message -----
From: Brian J. Murrell br...@whamcloud.com
To: lustre-discuss@lists.lustre.org
Sent: Friday, 25 February 2011 14:28:02 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

On 11-02-25 06:18 AM, Francois Chassaing wrote:
> Thanks, but anyway, the logs on the MDS/MGS do not show an evicted client of any kind. Also, the log output by lctl debug_kernel on the clients does not show much: I can only see the last administrative actions I have taken (such as setting the striping policy on a directory, creating a new server pool, ...) and four unrelated (because not happening at my problem hours) "Dropping PUT from" messages.
>
> I will continue to parse the debug logs and keep you posted.

I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
Re: [Lustre-discuss] clients gets EINTR from time to time
> I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.

To be fair ... normally disk I/O operations are not interruptible by signals, so it's not an unreasonable behavior on the part of an application. I did check POSIX, and it doesn't say that behavior is restricted only to network sockets, so yeah, it's TECHNICALLY allowable behavior according to the standard (although the Linux manpage for signal(7) says that it will not happen).

But honestly, I've seen plenty of cases where applications handle this for network I/O; it's normal, everyone knows it will happen there. But for _disk_ I/O? Never seen it done. I'm not saying that there are no applications that handle this case, but it's certainly very uncommon. I freely admit that network filesystems sort of mix the concepts of network socket and disk I/O together, and what is the right behavior is unclear. But calling this "perfectly valid and expected" is not quite accurate.

It would be interesting to see what other network filesystems do under the same circumstances.

--Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
Hi. I think it would help if you knew what the signal was. Do you have that yet?

I have a report from a user that is getting EINTR when a SIGALRM goes off during his write(). It isn't unexpected to get SIGALRM, because he set the alarm, but he also has SA_RESTART set. I can't remember whose responsibility it is to restart the call, the syscall or wherever, but it seems that someone is dropping the ball, because if EINTR is returned then SA_RESTART didn't seem to do the trick, right?

Thanks,
-Cory

On 2/25/2011 8:00 AM, Ken Hornstein wrote:
>> I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.
>
> To be fair ... normally disk I/O operations are not interruptible by signals, so it's not an unreasonable behavior on the part of an application. I did check POSIX, and it doesn't say that behavior is restricted only to network sockets, so yeah, it's TECHNICALLY allowable behavior according to the standard (although the Linux manpage for signal(7) says that it will not happen).
>
> But honestly, I've seen plenty of cases where applications handle this for network I/O; it's normal, everyone knows it will happen there. But for _disk_ I/O? Never seen it done. I'm not saying that there are no applications that handle this case, but it's certainly very uncommon. I freely admit that network filesystems sort of mix the concepts of network socket and disk I/O together, and what is the right behavior is unclear. But calling this "perfectly valid and expected" is not quite accurate.
>
> It would be interesting to see what other network filesystems do under the same circumstances.
>
> --Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
> I have a report from a user that is getting EINTR when a SIGALRM goes off during his write(). It isn't unexpected to get SIGALRM, because he set the alarm, but he also has SA_RESTART set. I can't remember whose responsibility it is to restart the call, the syscall or wherever, but it seems that someone is dropping the ball, because if EINTR is returned then SA_RESTART didn't seem to do the trick, right?

I would agree with you on that one; if you're setting SA_RESTART then you shouldn't ever get EINTR. It looks like what should be happening is that if you get interrupted, the system call should return ERESTARTSYS, and then after the signal handler is done the system call should be re-run for you by the signal handling code. I see that at least for some cases, Lustre will use ERESTARTSYS; just a guess, but maybe somewhere Lustre is returning EINTR itself instead of returning ERESTARTSYS?

--Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
On 02/25/2011 11:39 AM, Andreas Dilger wrote:
> On 2011-02-25, at 6:28, Brian J. Murrell br...@whamcloud.com wrote:
>> On 11-02-25 06:18 AM, Francois wrote:
>>> I will continue to parse the debug logs and keep you posted.
>>
>> I don't understand why you don't just fix your application to handle a perfectly valid and expected condition (that it's currently not handling) instead of wasting time trying to find the cause of the expected condition. Even if you find it, it's likely not a bug and not something that can/will be fixed. It's your application that needs to be fixed.
>
> In all fairness Brian, it isn't always possible to fix an application like you suggest. It might be commercial (binary only), it might be complex code using 3rd party libraries to do the IO that would lose support if modified, etc.
>
> I think the first action to debug this is to run on the client with "lctl set_param debug=+trace" or "=~0", which will enable function entry/exit tracing in Lustre. Then when the problem is hit, run "lctl dk /tmp/debug" to dump the Lustre debug log, and search for "-4" (which is -EINTR) to see where this error first appears. At that point we can make a determination where the source of the error is, and if it is Lustre's fault.
>
> I know at one time there was a related problem in the l_wait_event() macro that was improperly masking signals, but I thought it was fixed by 1.8.5.

Setting aside the moral question of which calls should be interruptible, I think that the handling of the LUSTRE_FATAL_SIGS (defined in lustre_lib.h to be SIGKILL, SIGINT, SIGTERM, SIGQUIT, SIGALRM) is slightly broken. Under certain situations, Lustre will return -EINTR although no signals were delivered. That's probably not the end of the world for most applications, but OTOH I don't think anybody assumes that -EINTR will be delivered spuriously.

Consider the following sequence:

1) Process P has a Lustre file F open.
2) P has SIGALRM pending (but blocked).
3) P starts writing to F and ends up sleeping in (something like): sys_write() ... ll_extent_lock() ... osc_enqueue() ... ptlrpc_queue_wait().
4) The OST does not respond to the request before the deadline, so l_wait_event() replaces the signal mask of P with the LUSTRE_FATAL_SIGS, notices that SIGALRM is now deliverable, restores the signal mask of P, and ptlrpc_queue_wait() returns -EINTR.
5) P is exiting from sys_write(), SIGALRM is blocked (but still pending) so it doesn't get delivered.
6) P spuriously returns -EINTR from sys_write().

I can reproduce this on 1.8.5/RHEL 5.5.

If the goal is to emulate NFS's interruptibility during congestion then returning -ERESTARTSYS would be more appropriate. Also, it might be worthwhile to make this extra interruptibility a mount flag, as NFS does.

Best,
John

-- 
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
jhamm...@tacc.utexas.edu
[Lustre-discuss] clients gets EINTR from time to time
Dear list members,

We are using Lustre 1.8.5 (upgraded from 1.8.4) running on 1 MGS, 3 OSS over DDR IB, and 2 patched clients mounted with the flock option.

We are experiencing issues with an application that gets an EINTR when trying to write to a file. These errors happen randomly on both clients, which makes it difficult to clearly spot the problem. My app then treats the error as if the file were full (which is the case when dealing with a normal disk) when it is not.

I have tried changing the IB switch, so the problem is most probably not coming from there (even though it is a cheap switch). I have also tried changing the client mount options and changing the striping policy from -1 to 1, but that did not change anything either. And no log of any kind is helpful on the MDS or OSSs.

I would really appreciate pointers or suggestions to debug this issue.

Thanks

François CHASSAING
Technical Director - CTO
WEBORAMA
Re: [Lustre-discuss] clients gets EINTR from time to time
On 11-02-24 05:50 AM, Francois Chassaing wrote:
> Dear list members,

Hi,

> We are experiencing issues with an application that gets an EINTR when trying to write to a file.

If I understand that errno properly, that is to be expected.

> These errors happen randomly on both clients,

Well, not randomly. It happens when a signal arrives.

> My app then treats the error as if the file were full

This is wrong. Your app is broken and needs to be fixed.

> I have tried changing the IB switch, so the problem is most probably not coming from there (even though it is a cheap switch). I have also tried changing the client mount options and changing the striping policy from -1 to 1, but that did not change anything either.

None of this is going to resolve your problem. Yours is a problem of application programming defect, not a system fault.

> I would really appreciate pointers or suggestions to debug this issue.

Maybe some understanding of how signals can affect system calls. A quick google found this for me:

http://www.gnu.org/s/libc/manual/html_node/Interrupted-Primitives.html#Interrupted-Primitives

Probably there is more detailed text out there to help you and your application programmer to handle this application programming fault better. But alas, it is an application programming problem and not a Lustre filesystem or equipment problem.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
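For reference, the usual application-side way of handling an interrupted write looks something like the loop below. It is a minimal sketch, assuming the application is happy to simply retry; write_all() is a name chosen for this example and is not taken from the application discussed in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd.
 * Returns 0 on success, -1 on a real error (errno is left set). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before any data moved: retry */
            return -1;           /* ENOSPC, EIO, ...: report to the caller */
        }
        p += n;                  /* short write: carry on with the rest */
        len -= (size_t)n;
    }
    return 0;
}
```

With a loop like this, EINTR is never mistaken for a full disk; only errors such as ENOSPC or EIO reach the caller.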
Re: [Lustre-discuss] clients gets EINTR from time to time
Well, I understand your point, and I also understand that this signal is not a malfunction; my question was about why (and when) this signal is sent to the client.

Thanks

François Chassaing
Technical Director - CTO
weborama.com - f...@weborama.com
T : +33 (0)1 53 19 21 51
F : +33 (0)1 53 19 21 41
Weborama - 15 rue Clavel 75019 Paris

----- Original Message -----
From: Brian J. Murrell br...@whamcloud.com
To: lustre-discuss@lists.lustre.org
Sent: Thursday, 24 February 2011 13:17:33 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

On 11-02-24 05:50 AM, Francois Chassaing wrote:
> Dear list members,

Hi,

> We are experiencing issues with an application that gets an EINTR when trying to write to a file.

If I understand that errno properly, that is to be expected.

> These errors happen randomly on both clients,

Well, not randomly. It happens when a signal arrives.

> My app then treats the error as if the file were full

This is wrong. Your app is broken and needs to be fixed.

> I have tried changing the IB switch, so the problem is most probably not coming from there (even though it is a cheap switch). I have also tried changing the client mount options and changing the striping policy from -1 to 1, but that did not change anything either.

None of this is going to resolve your problem. Yours is a problem of application programming defect, not a system fault.

> I would really appreciate pointers or suggestions to debug this issue.

Maybe some understanding of how signals can affect system calls. A quick google found this for me:

http://www.gnu.org/s/libc/manual/html_node/Interrupted-Primitives.html#Interrupted-Primitives

Probably there is more detailed text out there to help you and your application programmer to handle this application programming fault better. But alas, it is an application programming problem and not a Lustre filesystem or equipment problem.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
Re: [Lustre-discuss] clients gets EINTR from time to time
On 11-02-24 08:16 AM, Francois Chassaing wrote:
> Well, I understand your point, and I also understand that this signal is not a malfunction;

No, but not handling it properly is. Interpreting an EINTR as "the disk must be full" (i.e. a fatal error) is wrong.

> my question was about why (and when) this signal is sent to the client.

That's completely up to your application. It's the way your application has been written that is determining the hows and whys of signals.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
Re: [Lustre-discuss] clients gets EINTR from time to time
OK, the app is used to deal with standard disks, that is why it is not handling the EINTR signal properly. But I assumed that Lustre is 'just' a filesystem, so applications do not need to handle access to it any other way than the usual way.

Anyhow, the signal is from the OS, not from the app... So it means that the OS signals the app that it has encountered an error while trying to write to a file, and it is the source of that error that I want to track down. Because this app error only arises every few days, it means that it is not a normal condition: something somewhere in the FS causes it.

Interpreting it as a fatal error is certainly a mistake, but I still don't know why I'm getting this EINTR signal from the OS...

Regards,

François Chassaing
Technical Director - CTO

----- Original Message -----
From: Brian J. Murrell br...@whamcloud.com
To: lustre-discuss@lists.lustre.org
Sent: Thursday, 24 February 2011 14:29:27 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

On 11-02-24 08:16 AM, Francois Chassaing wrote:
> Well, I understand your point, and I also understand that this signal is not a malfunction;

No, but not handling it properly is. Interpreting an EINTR as "the disk must be full" (i.e. a fatal error) is wrong.

> my question was about why (and when) this signal is sent to the client.

That's completely up to your application. It's the way your application has been written that is determining the hows and whys of signals.

b.

-- 
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
Re: [Lustre-discuss] clients gets EINTR from time to time
> OK, the app is used to deal with standard disks, that is why it is not handling the EINTR signal properly.

I think you're misunderstanding what a signal is in the Unix sense. EINTR isn't a signal; it's a return code from the write() system call that says, "Hey, you got a signal in the middle of this write() call and it didn't complete". It doesn't mean that there was an error writing the file; if that was happening, you'd get a (presumably different) error code. Signals can be sent by the operating system, but those signals are things like SIGSEGV, which basically means, "your program screwed up". Programs can also send signals to each other, with kill(2) and the like.

Now, NORMALLY system calls like write() are interrupted by signals when you're writing to "slow" devices, like network sockets. According to the signal(7) man page, disks are not normally considered slow devices, so I can understand the application not being used to handling this. And you know, now that I think about it I'm not even sure that network filesystems SHOULD allow I/O system calls to be interrupted by signals ... I'd have to think more about it.

I suspect what happened is that something changed between 1.8.5 and the previous version of Lustre that you were using that allowed some operations to be interruptible by signals.

Some things to try:

- Check to see if you are, in fact, receiving a signal in your application and Lustre isn't returning EINTR for some other reason.
- If you are receiving a signal, when you set the signal handler for it you could use the SA_RESTART flag to restart the interrupted I/O; I think that would make everything work like it did before.

--Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
OK, thanks, that makes it clearer. I had indeed mixed up (in my mind and in my words) signals and error return codes. I did understand that the write()/pwrite() system call was returning the EINTR error code because it received a signal, but I supposed that the signal was sent because of an error condition somewhere in the FS. This is where I now think I was wrong.

As for your questions:
- I have to mention that I have always had this issue, and this is why I upgraded from 1.8.4 to 1.8.5, hoping this would solve it.
- I will try to have the SA_RESTART flag set in the app... if I can find where the signal handler is set.
- How can I see whether Lustre is returning EINTR for any other reason? As I said, no log shows anything on either the MDS or the OSSs, but I have not examined lctl debug_kernel yet... which I am going to do right away.

My last question is: how can I tell which signal I am receiving? My app doesn't say; it just prints the write/pwrite error code. If there is no signal handler, then it should follow the standard actions (as per man 7 signal). On the other hand, my app does not stop or dump core, and the signal is not ignored, so it has to be handled in the code. Correct me if I'm wrong...

At this point you will have realized that I didn't write the app, nor am I a good Linux guru ;-)

Thanks a lot.

François Chassaing
Technical Director - CTO

----- Original Message -----
From: Ken Hornstein k...@cmf.nrl.navy.mil
To: Francois Chassaing f...@weborama.com
Cc: lustre-discuss@lists.lustre.org
Sent: Thursday, 24 February 2011 15:54:24 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [Lustre-discuss] clients gets EINTR from time to time

> OK, the app is used to deal with standard disks, that is why it is not handling the EINTR signal properly.

I think you're misunderstanding what a signal is in the Unix sense. EINTR isn't a signal; it's a return code from the write() system call that says, "Hey, you got a signal in the middle of this write() call and it didn't complete". It doesn't mean that there was an error writing the file; if that was happening, you'd get a (presumably different) error code. Signals can be sent by the operating system, but those signals are things like SIGSEGV, which basically means, "your program screwed up". Programs can also send signals to each other, with kill(2) and the like.

Now, NORMALLY system calls like write() are interrupted by signals when you're writing to "slow" devices, like network sockets. According to the signal(7) man page, disks are not normally considered slow devices, so I can understand the application not being used to handling this. And you know, now that I think about it I'm not even sure that network filesystems SHOULD allow I/O system calls to be interrupted by signals ... I'd have to think more about it.

I suspect what happened is that something changed between 1.8.5 and the previous version of Lustre that you were using that allowed some operations to be interruptible by signals.

Some things to try:

- Check to see if you are, in fact, receiving a signal in your application and Lustre isn't returning EINTR for some other reason.
- If you are receiving a signal, when you set the signal handler for it you could use the SA_RESTART flag to restart the interrupted I/O; I think that would make everything work like it did before.

--Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
> As for your questions:
> - I have to mention that I always had had this issue, and this is why I've upgraded from 1.8.4 to 1.8.5, hoping this would solve it.

Ah, okay, I misunderstood that; my apologies.

> - I will try to have that SA_RESTART flag set in the app... if I can find where the signal handler is set.

Searching for sigaction or signal should help there.

> - How can I see that lustre is returning EINTR for any other reason? As I said no logs shows nothing neither on MDS or OSSs, but I didn't go through examining lctl debug_kernel yet... which I'm going to do right away...

Weeelll ... that was just a guess on my part. I did a quick grep through the Lustre sources and saw a few places where EINTR was returned, but most of those seemed to deal with the case where I/O was interrupted (those places were fairly far down in the stack; it wasn't clear to me that those errors would ever bubble back up to the return code of a system call). If _that_ is the issue, then tracking that down will be a challenge.

> my last question is: how can I tell which signal I am receiving? because my app doesn't say, it just dumps outs the write/pwrite error code.

I think your easiest way is to use strace; something like strace -e signal should do the right thing (that will only trace signals, not all system calls).

> And if there is no signal handler, then it should follow the standard actions (as of man 7 signal). On the other hand, my app does not stop or dump core, and is not ignored, so it has to be handled in the code. Correct me if I'm wrong...

That is my understanding as well; if you don't have a signal handler installed, the default action should be taking place, and if the default action is to ignore the signal then you shouldn't be getting EINTR. But hey, I've been wrong before :-)

--Ken
Re: [Lustre-discuss] clients gets EINTR from time to time
Hello,

From my understanding, Lustre can return EINTR for some I/O error cases. I think that when a client gets evicted in the middle of one of its RPCs, it can return EINTR to the caller.

Could this explain your issue? Can you verify that your clients were not evicted at the same time?

Aurélien
Re: [Lustre-discuss] clients gets EINTR from time to time
On 11-02-24 11:57 AM, DEGREMONT Aurelien wrote:
> Hello

Hi,

> From my understanding, Lustre can return EINTR for some I/O error cases.

I think that should/would be an EIO.

> I think that when a client gets evicted in the middle of one of its RPC, it can returns EINTR to the caller.

An evicted client should get an EIO on its I/O calls, IIRC.

b.

--
Brian J. Murrell
Senior Software Engineer
Whamcloud, Inc.
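Since EINTR and EIO call for different reactions, an application-side write wrapper usually treats them differently. The following is a minimal sketch, not code from the application in this thread; write_fully is a hypothetical helper name. It retries when write() is interrupted, continues after short writes, and hands any other error such as EIO back to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write all `len` bytes of `buf` to `fd`.
 * Returns 0 once everything has been written, or -1 on a genuine error,
 * with errno left as set by write() (for example EIO).
 */
static int write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: just retry */
            return -1;          /* EIO and other errors are real failures */
        }

        /* A short write is not an error; carry on with the remainder. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With a wrapper like this, an occasional EINTR is absorbed silently, while an EIO, which per the replies here is what an eviction should produce, still reaches the caller as an error.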
Re: [Lustre-discuss] clients gets EINTR from time to time
No, in case of an eviction or IO errors, EIO is returned to the application, not EINTR.

Kevin