Re: [Bug 535583] Excessive logging by apcsmart program
2011/4/21 Michal Soltys sol...@ziu.info

> On 11-04-21 10:34, Arnaud Quette wrote:
> > Hi Lupe,
> >
> > since we now have an apcsmart maintainer, I'm forwarding this issue to him.
> >
> > @Michal: could you please have a look at this issue [1], and give us your feeling?
> >
> > cheers,
> > Arnaud
> > --
> > [1] https://bugs.launchpad.net/bugs/535583
> >
> > 2011/2/15 Lupe Christoph l...@lupe-christoph.de
>
> The suggestions are pretty fine.
>
> - flushing stale input (though at driver level)
>
> Certainly. I even added some flushes earlier, but haven't touched updateinfo and/or the functions it calls yet. I'll add it along with forthcoming patches (icanon mode and the rest). Looking at the strace, flushing post-failure might be a good idea in certain cases as well.
>
> - reopening serial port
>
> If the upper layers of nut don't disallow this kind of behaviour for some reason, it's a good idea as well. It should be helpful in weird cases, and at the very least wouldn't hurt at all. Whether it would help in this particular case is hard to say.

None in particular. We already have to do something similar with USB devices.

> - smartmode()
>
> TBH, I'm not sure why it diligently tries to enter SM 5 times. Pre-emptive flush + 'Y' + reasonable delay (icanon or not) should be all that is necessary. If we don't succeed, the next attempt shouldn't miraculously (in theory) make much of a difference 1 second later ...
>
> Thanks for pointing out those issues.

thanks for taking care of it ;-)

cheers,
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/

--
You received this bug notification because you are a member of Ubuntu Server Team, which is subscribed to nut in Ubuntu.
https://bugs.launchpad.net/bugs/535583 Title: Excessive logging by apcsmart program -- Ubuntu-server-bugs mailing list Ubuntu-server-bugs@lists.ubuntu.com Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs
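A minimal sketch of the single-attempt sequence Michal describes (pre-emptive flush, 'Y', bounded wait), written against plain POSIX calls rather than NUT's serial helpers. The names enter_smart_mode_once and demo_fake_ups are hypothetical and do not exist in apcsmart.c; the sketch also assumes the UPS acknowledges 'Y' with the string "SM".

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <termios.h>
#include <unistd.h>

/*
 * Hypothetical helper, not NUT's smartmode(): make ONE attempt to enter
 * smart mode. Returns 1 if "SM" was seen, 0 on timeout or any other
 * reply, -1 on an I/O error.
 */
int enter_smart_mode_once(int fd, int timeout_ms)
{
	char reply[16];
	size_t used = 0;

	assert(fd >= 0 && fd < FD_SETSIZE);

	/* Drop stale input first. On a non-tty fd this fails with ENOTTY,
	 * which is ignored here. */
	(void)tcflush(fd, TCIFLUSH);

	if (write(fd, "Y", 1) != 1)
		return -1;

	while (used < sizeof(reply) - 1) {
		fd_set rfds;
		struct timeval tv;

		/* The timeout applies to each wait, not to the whole exchange. */
		tv.tv_sec = timeout_ms / 1000;
		tv.tv_usec = (timeout_ms % 1000) * 1000;
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);

		int ready = select(fd + 1, &rfds, NULL, NULL, &tv);
		if (ready < 0)
			return -1;
		if (ready == 0)
			break;			/* nothing arrived in time */

		ssize_t n = read(fd, reply + used, sizeof(reply) - 1 - used);
		if (n < 0)
			return -1;
		if (n == 0)
			break;			/* peer closed */

		used += (size_t)n;
		reply[used] = '\0';
		if (strstr(reply, "SM") != NULL)
			return 1;
	}
	return 0;
}

/* Usage example: a fake UPS on a socketpair, so no serial hardware is needed. */
int demo_fake_ups(void)
{
	int sv[2];
	char cmd = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return -1;

	/* Queue the fake reply in advance. tcflush() fails on a socket and is
	 * ignored, so the queued reply is still there when we read. */
	if (write(sv[1], "SM\r\n", 4) != 4)
		return -1;

	int ok = enter_smart_mode_once(sv[0], 200);

	/* The fake UPS side should have received exactly the 'Y' command. */
	if (read(sv[1], &cmd, 1) != 1 || cmd != 'Y')
		ok = -1;

	close(sv[0]);
	close(sv[1]);
	return ok;
}
```

A caller that gets 0 back could log once and back off instead of retrying every second; whether that suits other APC units is not something this sketch can answer.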
Re: [Bug 535583] Excessive logging by apcsmart program
Hi Lupe,

since we now have an apcsmart maintainer, I'm forwarding this issue to him.

@Michal: could you please have a look at this issue [1], and give us your feeling?

cheers,
Arnaud
--
[1] https://bugs.launchpad.net/bugs/535583

2011/2/15 Lupe Christoph l...@lupe-christoph.de

> On Tuesday, 2011-02-15 at 13:16:58 -, Arnaud Quette wrote:
>
> > this is not the problem. This code is in the smartmode() function of apcsmart.c:
> > http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c
> > we see the 5 attempts to go to smart mode ('Y' command), but my aim is to understand why it is failing, and how to cleanly solve this without impacting support for other units.
>
> I found no code that does five attempts. But there is this code in main.c, starting on line 618:
>
>     while (!exit_flag) {
>         struct timeval timeout;
>
>         gettimeofday(&timeout, NULL);
>         timeout.tv_sec += poll_interval;
>
>         upsdrv_updateinfo();
>
>         while (!dstate_poll_fds(timeout, extrafd) && !exit_flag) {
>             /* repeat until time is up or extrafd has data */
>         }
>     }
>
> upsdrv_updateinfo() calls smartmode(). dstate_poll_fds() checks whether any file descriptor is ready. In our case:
>
>     select(7, [4 5 6], NULL, NULL, {1, 999837}) = 1 (in [4], left {1, 999835})
>
> FD 4 is the serial line, which is passed to dstate_poll_fds() as extrafd. When there is data that can be read from the UPS, no code in dstate_poll_fds() reads from extrafd; there is only code that reads from the other input FDs. The outer loop above also ignores extrafd. exit_flag is never set, so the loop continues. And because there is an active file descriptor, the select returns immediately (actually it takes two microseconds).
>
> The solution is to add code that reads all data from extrafd and discards it, because nobody asked for it. I would also close and reopen the serial line in smartmode().
>
> I would prepare a patch if I knew more about the I/O abstractions used in the nut driver code. Sorry.
>
> HTH,
> Lupe Christoph
> --
> | It is a well-known fact in any organisation that, if you want a job |
> | done, you should give it to someone who is already very busy.       |
> | Terry Pratchett, Unseen Academicals                                  |
Re: [Bug 535583] Excessive logging by apcsmart program
2011/2/15 Lupe Christoph

> On Monday, 2011-02-14 at 21:54:20 -, Arnaud Quette wrote:
>
> > I definitely need more info! please reply to ALL:
> > - what is the exact model and date of manufacturing?
>
> SmartUPS 300I NET. I have the serial number (GS9809283199) but no date.

it seems to be a recent model.

> > - are you sure this unit is ok?
>
> You can't prove the absence of faults.

this was related to the following question...

> > - have you really checked the cabling or made the whole (cable + UPS) work somehow (using APC's software or apcupsd)?
>
> Well, as I said this is working OK for days or weeks. Then something happens that triggers a bug in apcsmart.

quickly reading back the thread, I can't find this info...

> > - what is the mean time between occurrences of these issues?
>
> I don't have enough data. It's in the range of weeks or months.

as per your previous posts, this seemed more to be a matter of minutes / hours.

> > - is the device reachable (using upsc for example) between issues?
>
> Sure, everything works fine.
>
> > A driver debug output is really needed!
>
> I'm running it again, but no promises. Reboots are much more frequent than this misbehaviour.
>
> > Note that I'm not the developer of this driver, nor have any acquaintance with APC.
>
> Same here. Though I will probably try to locate this bug if we don't make progress with the debugging output, either because it does not tell us enough or because I don't manage to capture it. I would have thought finding the place in the code where it is trying to reset the UPS connection wouldn't be this hard.

this is not the problem. This code is in the smartmode() function of apcsmart.c:
http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c
we see the 5 attempts to go to smart mode ('Y' command), but my aim is to understand why it is failing, and how to cleanly solve this without impacting support for other units.

Some more questions:
- how are you handling the device's permissions? Refer to § II, section 3:
  http://git.debian.org/?p=collab-maint/nut.git;a=blob_plain;f=debian/nut.README.Debian;hb=HEAD

cheers
Arnaud
--
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/
--
Municipal Councillor - Saint Bernard du Touvet
Re: [Bug 535583] Excessive logging by apcsmart program
On Tuesday, 2011-02-15 at 14:16:58 +0100, Arnaud Quette wrote:

> > I would have thought finding the place in the code where it is trying to reset the UPS connection wouldn't be this hard.
>
> this is not the problem. This code is in the smartmode() function of apcsmart.c:
> http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c

I'll have a look at that code.

> we see the 5 attempts to go to smart mode ('Y' command), but my aim is to understand why it is failing, and how to cleanly solve this without impacting support for other units.

Of course. The problem is that the program keeps sending the command indefinitely, probably because of the EIO.

> Some more questions:
> - how are you handling the device's permissions? Refer to § II, section 3:
>   http://git.debian.org/?p=collab-maint/nut.git;a=blob_plain;f=debian/nut.README.Debian;hb=HEAD

/etc/udev/rules.d/zzzlpc.rules:

    KERNEL=="ttyS2", OWNER="nut", GROUP="nut", MODE="0660"

The serial line is on a PCI board. It may be a problem with that board rather than the UPS, one that is cleared by closing the device.

Lupe Christoph
--
| It is a well-known fact in any organisation that, if you want a job |
| done, you should give it to someone who is already very busy.       |
| Terry Pratchett, Unseen Academicals                                  |
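A minimal sketch of the "close the device to clear the error" idea, using only POSIX open/close/write. reopen_port and write_with_reopen are hypothetical names, not NUT functions, and the sketch assumes the EIO shows up on write(); the thread does not say which call actually returns it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: close the old descriptor (if any) and open the
 * device again. Returns the new descriptor, or -1 with errno set.
 * A real driver would also have to reapply its termios settings
 * (speed, parity, raw or canonical mode) after this call.
 */
int reopen_port(int fd, const char *path)
{
	assert(path != NULL);

	if (fd >= 0)
		(void)close(fd);

	return open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);
}

/*
 * Write len bytes; if the write fails with EIO, reopen the port once and
 * retry once. *fdp is updated when the port is reopened.
 * Returns the number of bytes written, or -1 on failure.
 */
long write_with_reopen(int *fdp, const char *path, const void *buf, size_t len)
{
	ssize_t n = write(*fdp, buf, len);

	if (n >= 0 || errno != EIO)
		return (long)n;

	*fdp = reopen_port(*fdp, path);
	if (*fdp < 0)
		return -1;

	return (long)write(*fdp, buf, len);
}
```

Retrying only once keeps a persistently broken line from turning into another tight loop; any further recovery policy would be the driver's decision.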
Re: [Bug 535583] Excessive logging by apcsmart program
On Tuesday, 2011-02-15 at 13:16:58 -, Arnaud Quette wrote:

> this is not the problem. This code is in the smartmode() function of apcsmart.c:
> http://svn.debian.org/wsvn/nut/trunk/drivers/apcsmart.c
> we see the 5 attempts to go to smart mode ('Y' command), but my aim is to understand why it is failing, and how to cleanly solve this without impacting support for other units.

I found no code that does five attempts. But there is this code in main.c, starting on line 618:

    while (!exit_flag) {
        struct timeval timeout;

        gettimeofday(&timeout, NULL);
        timeout.tv_sec += poll_interval;

        upsdrv_updateinfo();

        while (!dstate_poll_fds(timeout, extrafd) && !exit_flag) {
            /* repeat until time is up or extrafd has data */
        }
    }

upsdrv_updateinfo() calls smartmode(). dstate_poll_fds() checks whether any file descriptor is ready. In our case:

    select(7, [4 5 6], NULL, NULL, {1, 999837}) = 1 (in [4], left {1, 999835})

FD 4 is the serial line, which is passed to dstate_poll_fds() as extrafd. When there is data that can be read from the UPS, no code in dstate_poll_fds() reads from extrafd; there is only code that reads from the other input FDs. The outer loop above also ignores extrafd. exit_flag is never set, so the loop continues. And because there is an active file descriptor, the select returns immediately (actually it takes two microseconds).

The solution is to add code that reads all data from extrafd and discards it, because nobody asked for it. I would also close and reopen the serial line in smartmode().

I would prepare a patch if I knew more about the I/O abstractions used in the nut driver code. Sorry.

HTH,
Lupe Christoph
--
| It is a well-known fact in any organisation that, if you want a job |
| done, you should give it to someone who is already very busy.       |
| Terry Pratchett, Unseen Academicals                                  |
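A minimal sketch of the discard step proposed in this message, assuming plain POSIX select() and read() instead of NUT's own I/O layer. fd_has_input and discard_pending_input are hypothetical names and are not part of main.c or dstate.c. The sketch illustrates why the loop spins: as long as unread bytes sit on the descriptor, select() reports it ready at once, and only reading them makes it quiet again.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if not, -1 on error. */
int fd_has_input(int fd)
{
	fd_set rfds;
	struct timeval tv = { 0, 0 };	/* zero timeout: poll, do not block */

	assert(fd >= 0 && fd < FD_SETSIZE);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);

	return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/*
 * Read and throw away everything currently pending on fd.
 * Returns the number of bytes discarded, or -1 on a read error.
 */
long discard_pending_input(int fd)
{
	char buf[256];
	long total = 0;

	while (fd_has_input(fd) == 1) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)
			break;			/* end of file */
		if (errno == EINTR)
			continue;		/* interrupted, try again */
		if (errno == EAGAIN)
			break;			/* nothing left on a non-blocking fd */
		return -1;
	}
	return total;
}
```

In the polling loop quoted above, a call like discard_pending_input(extrafd) after dstate_poll_fds() reports activity on extrafd would stop the immediate re-trigger; where exactly such a call belongs in NUT is left open here.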