Hi, See my comments to Anders W below
Thanks' Lennart > -----Original Message----- > From: Anders Widell [mailto:anders.wid...@ericsson.com] > Sent: den 16 augusti 2013 10:47 > To: opensaf-devel@lists.sourceforge.net > Subject: Re: [devel] [PATCH 0 of 3] Review Request for logsv: Fix hanging > main thread when file i/o dont return [#9] > > Hi! > > I have some comments: > > * Time-out limits shall be configurable (according to our new policy on > limits) Already done. See patch, part 12 > > * I think the use code for killing and re-starting the slave thread is > overkill and > should be removed. Unless we know (and have seen) that this solves a real > problem that can happen in practice. Ok, I will remove it > > regards, > Anders Widell > > 2013-08-09 15:20, Lennart Lund skrev: > > Summary: logsv: Fix hanging main thread when file i/o don't return > > Review request for Trac Ticket(s): #9 Peer Reviewer(s): Madhurika > > Koppula, (Anders Widell, Hans Feldt) Pull request to: NA Affected > > branch(es): devel (4.4) Development branch: <<IF ANY GIVE THE REPO > > URL>> > > > > > > -------------------------------- > > Impacted area Impact y/n > > -------------------------------- > > Docs n > > Build system n > > RPM/packaging n > > Configuration files n > > Startup scripts n > > SAF services y > > OpenSAF services n > > Core libraries n > > Samples n > > Tests n > > Other n > > > > > > Comments (indicate scope for each "y" above): > > --------------------------------------------- > > In order to protect the log server "main thread" (MT) from hanging if > > a file operation like write, mkdir etc. does not return, all such > > operations are done in a separate "file thread" (FT). > > Functions running in the "Main Thread" (MT) that needs file system > > operations handle over the execution to the FT when file handling has > > to be done. Execution is then given back to the MT again. If a file > > operation does not return FT will hang but MT will time out the FT and > resume. A timeout will be handled as a file operation fail. > > The MT can detect if the FT is hanging and new requests for file operations > will be "failed". > > > > Note1: This is an add on to the patches sent out in prevoius review > requests. > > Note2: The last patch (part 11); The non block handling of log files that > > was > suggested by Madhurika > > is contained in its' own patch. > > > > > > changeset 43a3e4173f05a01aa595b3d770a1464a3338f32e > > Author: Lennart Lund <lennart.l...@ericsson.com> > > Date: Fri, 09 Aug 2013 13:46:36 +0200 > > > > logsv: Fix hanging main thread when file i/o don't return. [#9]. Part > > 9 > > > > - Remove unnecessary data copying in log_file_api() and > file_hndl_thread() > > - Return SA_AIS_ERR_TIMEOUT if the write operation time out when > a log > > record shall be written. If the file thread is already "hanging" when a > > write is requested no attempt to write is made and > SA_AIS_ERR_TRY_AGAIN is > > returned as before. > > - Try to recover file thread by recreating it if it hangs for a long > > time. > > - Recover if bad file descriptor or stale NFS handle. > > > > changeset d3d78dc3ad87e083e411e0ce5436bcca511d54d6 > > Author: Lennart Lund <lennart.l...@ericsson.com> > > Date: Fri, 09 Aug 2013 13:46:36 +0200 > > > > logsv: Fix hanging main thread when file i/o don't return. [#9]. Part > > 10 > > > > - Always reinitialize/reopen log files if a write operation fails, > > timeout > > of file thread (hanging file system) included. > > - Handle synchronization between nodes when log files cannot be > created before > > a switch over without using any new flag that has to be checkpointed > > (remove "files_initialized" flag) > > - Incorrect handling of "partial write" is fixed. See #536 > > > > changeset b8a2060ff5fd6d2685f95757d422addeca1ebdb0 > > Author: Lennart Lund <lennart.l...@ericsson.com> > > Date: Fri, 09 Aug 2013 13:46:36 +0200 > > > > logsv: Fix hanging main thread when file i/o don't return. [#9]. Part > > 11 > > > > - Open log files with O_NONBLOCK. Answer client with > AIS_ERR_TIMEOUT if > > EWOULDBLOCK/EAGAIN (record may be parially written) > > > > > > Complete diffstat: > > ------------------ > > osaf/services/saf/logsv/lgs/lgs_evt.c | 11 ++-- > > osaf/services/saf/logsv/lgs/lgs_file.c | 197 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > +++++------------------------- > > osaf/services/saf/logsv/lgs/lgs_filehdl.c | 2 +- > > osaf/services/saf/logsv/lgs/lgs_imm.c | 1 - > > osaf/services/saf/logsv/lgs/lgs_mbcsv.c | 7 --- > > osaf/services/saf/logsv/lgs/lgs_mbcsv.h | 3 - > > osaf/services/saf/logsv/lgs/lgs_stream.c | 115 > +++++++++++++++++++++------------------------------ > > osaf/services/saf/logsv/lgs/lgs_stream.h | 1 - > > 8 files changed, 196 insertions(+), 141 deletions(-) > > > > > > Testing Commands: > > ----------------- > > See previous review > > > > > > Testing, Expected Results: > > -------------------------- > > <<PASTE COMMAND OUTPUTS / TEST RESULTS>> > > > > > > Conditions of Submission: > > ------------------------- > > <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>> > > > > > > Arch Built Started Linux distro > > ------------------------------------------- > > mips n n > > mips64 n n > > x86 n n > > x86_64 n n > > powerpc n n > > powerpc64 n n > > > > > > Reviewer Checklist: > > ------------------- > > [Submitters: make sure that your review doesn't trigger any > > checkmarks!] > > > > > > Your checkin has not passed review because (see checked entries): > > > > ___ Your RR template is generally incomplete; it has too many blank entries > > that need proper data filled in. > > > > ___ You have failed to nominate the proper persons for review and push. > > > > ___ Your patches do not have proper short+long header > > > > ___ You have grammar/spelling in your header that is unacceptable. > > > > ___ You have exceeded a sensible line length in your > headers/comments/text. > > > > ___ You have failed to put in a proper Trac Ticket # into your commits. > > > > ___ You have incorrectly put/left internal data in your comments/files > > (i.e. internal bug tracking tool IDs, product names etc) > > > > ___ You have not given any evidence of testing beyond basic build tests. > > Demonstrate some level of runtime or other sanity testing. > > > > ___ You have ^M present in some of your files. These have to be removed. > > > > ___ You have needlessly changed whitespace or added whitespace crimes > > like trailing spaces, or spaces before tabs. > > > > ___ You have mixed real technical changes with whitespace and other > > cosmetic code cleanup changes. These have to be separate commits. > > > > ___ You need to refactor your submission into logical chunks; there is > > too much content into a single commit. > > > > ___ You have extraneous garbage in your review (merge commits etc) > > > > ___ You have giant attachments which should never have been sent; > > Instead you should place your content in a public tree to be pulled. > > > > ___ You have too many commits attached to an e-mail; resend as threaded > > commits, or place in a public tree for a pull. > > > > ___ You have resent this content multiple times without a clear indication > > of what has changed between each re-send. > > > > ___ You have failed to adequately and individually address all of the > > comments and change requests that were proposed in the initial review. > > > > ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc) > > > > ___ Your computer have a badly configured date and time; confusing the > > the threaded patch review. > > > > ___ Your changes affect IPC mechanism, and you don't present any results > > for in-service upgradability test. > > > > ___ Your changes affect user manual and documentation, your patch series > > do not contain the patch that updates the Doxygen manual. > > > > > > ---------------------------------------------------------------------- > > -------- Get 100% visibility into Java/.NET code with AppDynamics > > Lite! > > It's a free troubleshooting tool designed for production. > > Get down to code-level detail for bottlenecks, with <2% overhead. > > Download for free and get started troubleshooting in minutes. > > http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.c > > lktrk _______________________________________________ > > Opensaf-devel mailing list > > Opensaf-devel@lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/opensaf-devel > > > > > > > ------------------------------------------------------------------------------ > Get 100% visibility into Java/.NET code with AppDynamics Lite! > It's a free troubleshooting tool designed for production. > Get down to code-level detail for bottlenecks, with <2% overhead. > Download for free and get started troubleshooting in minutes. > http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clk > trk > _______________________________________________ > Opensaf-devel mailing list > Opensaf-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/opensaf-devel ------------------------------------------------------------------------------ Get 100% visibility into Java/.NET code with AppDynamics Lite! It's a free troubleshooting tool designed for production. Get down to code-level detail for bottlenecks, with <2% overhead. Download for free and get started troubleshooting in minutes. http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel