Hi,

See my comments to Anders W below

Thanks,
Lennart

> -----Original Message-----
> From: Anders Widell [mailto:anders.wid...@ericsson.com]
> Sent: 16 August 2013 10:47
> To: opensaf-devel@lists.sourceforge.net
> Subject: Re: [devel] [PATCH 0 of 3] Review Request for logsv: Fix hanging
> main thread when file i/o dont return [#9]
> 
> Hi!
> 
> I have some comments:
> 
> * Time-out limits shall be configurable (according to our new policy on
> limits)

Already done. See patch, part 12

> 
> * I think the code for killing and restarting the slave thread is overkill
> and should be removed, unless we know (and have seen) that it solves a real
> problem that can happen in practice.

Ok, I will remove it

> 
> regards,
> Anders Widell
> 
> 2013-08-09 15:20, Lennart Lund wrote:
> > Summary: logsv: Fix hanging main thread when file i/o don't return
> > Review request for Trac Ticket(s): #9
> > Peer Reviewer(s): Madhurika Koppula, (Anders Widell, Hans Feldt)
> > Pull request to: NA
> > Affected branch(es): devel (4.4)
> > Development branch: <<IF ANY GIVE THE REPO URL>>
> >
> >
> > --------------------------------
> > Impacted area       Impact y/n
> > --------------------------------
> >   Docs                    n
> >   Build system            n
> >   RPM/packaging           n
> >   Configuration files     n
> >   Startup scripts         n
> >   SAF services            y
> >   OpenSAF services        n
> >   Core libraries          n
> >   Samples                 n
> >   Tests                   n
> >   Other                   n
> >
> >
> > Comments (indicate scope for each "y" above):
> > ---------------------------------------------
> > In order to protect the log server "main thread" (MT) from hanging if
> > a file operation like write, mkdir etc. does not return, all such
> > operations are done in a separate "file thread" (FT).
> > Functions running in the MT that need file system operations hand
> > over execution to the FT when file handling has to be done. Execution
> > is then given back to the MT. If a file operation does not return, the
> > FT will hang, but the MT will time out the FT and resume. A timeout is
> > handled as a failed file operation.
> > The MT can detect if the FT is hanging, and new requests for file
> > operations will then be "failed".
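To make the hand-over concrete, here is a minimal sketch of the MT/FT idea using a mutex, two condition variables and pthread_cond_timedwait(). It is not the code in lgs_file.c; every name in it (mt_file_op, file_thread, ft_busy, the FT_* results, the op_quick/op_slow demo operations) is hypothetical and only illustrates the principle.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* All names below are hypothetical; this is not the lgs_file.c code. */
typedef int (*file_op_fn)(void *arg);	/* returns 0 on success */

enum ft_result { FT_OK, FT_FAILED, FT_TIMEOUT, FT_BUSY };

static pthread_mutex_t ft_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ft_request = PTHREAD_COND_INITIALIZER;
static pthread_cond_t ft_answer = PTHREAD_COND_INITIALIZER;
static file_op_fn ft_op;	/* pending operation, NULL when none */
static void *ft_arg;
static int ft_rc;
static bool ft_busy;		/* true from hand-over until the FT answers */

/* File thread: runs one operation at a time and may hang inside op(). */
static void *file_thread(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&ft_lock);
	for (;;) {
		while (ft_op == NULL)
			pthread_cond_wait(&ft_request, &ft_lock);
		file_op_fn op = ft_op;
		void *arg = ft_arg;
		ft_op = NULL;
		pthread_mutex_unlock(&ft_lock);
		int rc = op(arg);	/* the only place that can block */
		pthread_mutex_lock(&ft_lock);
		ft_rc = rc;
		ft_busy = false;
		pthread_cond_signal(&ft_answer);
	}
}

static int ft_start(void)
{
	pthread_t tid;
	int rc = pthread_create(&tid, NULL, file_thread, NULL);
	if (rc == 0)
		pthread_detach(tid);
	return rc;
}

/* Main thread: hand over op and wait at most timeout_ms for the answer. */
static enum ft_result mt_file_op(file_op_fn op, void *arg, long timeout_ms)
{
	struct timespec deadline;
	enum ft_result res;
	int rc = 0;

	pthread_mutex_lock(&ft_lock);
	if (ft_busy) {		/* FT is still hanging: fail at once */
		pthread_mutex_unlock(&ft_lock);
		return FT_BUSY;
	}
	ft_op = op;
	ft_arg = arg;
	ft_busy = true;
	pthread_cond_signal(&ft_request);

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec += 1;
		deadline.tv_nsec -= 1000000000L;
	}
	while (ft_busy && rc == 0)
		rc = pthread_cond_timedwait(&ft_answer, &ft_lock, &deadline);

	if (ft_busy)
		res = FT_TIMEOUT;	/* handled as a failed operation */
	else
		res = (ft_rc == 0) ? FT_OK : FT_FAILED;
	pthread_mutex_unlock(&ft_lock);
	return res;
}

/* Demo operations: one returns at once, one "hangs" for 0.5 s. */
static int op_quick(void *arg) { (void)arg; return 0; }
static int op_slow(void *arg)
{
	struct timespec d = { 0, 500 * 1000000L };
	(void)arg;
	nanosleep(&d, NULL);
	return 0;
}
```

With this sketch, mt_file_op(op_slow, NULL, 50) times out after about 50 ms, and a request made while op_slow is still running is rejected immediately with FT_BUSY. Note that after a timeout the FT may still be using arg, so in this sketch the caller must keep that memory valid until the FT has returned.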
> >
> > Note1: This is an add-on to the patches sent out in previous review
> >        requests.
> > Note2: The last patch (part 11), the non-blocking handling of log files
> >        that was suggested by Madhurika, is contained in its own patch.
> >
> >
> > changeset 43a3e4173f05a01aa595b3d770a1464a3338f32e
> > Author:     Lennart Lund <lennart.l...@ericsson.com>
> > Date:       Fri, 09 Aug 2013 13:46:36 +0200
> >
> >     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part 9
> >
> >     - Remove unnecessary data copying in log_file_api() and
> >     file_hndl_thread()
> >     - Return SA_AIS_ERR_TIMEOUT if the write operation times out when a log
> >     record shall be written. If the file thread is already "hanging" when a
> >     write is requested, no attempt to write is made and SA_AIS_ERR_TRY_AGAIN
> >     is returned as before.
> >     - Try to recover the file thread by recreating it if it hangs for a long
> >     time.
> >     - Recover if bad file descriptor or stale NFS handle.
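As an illustration of the last item, recovery from a bad file descriptor or a stale NFS handle could look roughly like the sketch below: if write() fails with EBADF or ESTALE, the file is reopened and the write is retried once. The helper name write_with_reopen and the open flags are assumptions made for this example, not what the patch actually contains.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write to *fd and, if the descriptor is bad or the
 * NFS handle is stale, reopen path and retry once. Returns the number of
 * bytes written, or -1 with errno set.
 */
static ssize_t write_with_reopen(int *fd, const char *path,
				 const void *buf, size_t len)
{
	ssize_t n = write(*fd, buf, len);

	if (n == -1 && (errno == EBADF || errno == ESTALE)) {
		if (*fd >= 0)
			close(*fd);	/* drop the unusable descriptor */
		*fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
		if (*fd == -1)
			return -1;	/* reopen failed, errno from open() */
		n = write(*fd, buf, len);
	}
	return n;
}
```

Only one retry is made, so a file system that keeps failing still gives an error back to the caller instead of looping.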
> >
> > changeset d3d78dc3ad87e083e411e0ce5436bcca511d54d6
> > Author:     Lennart Lund <lennart.l...@ericsson.com>
> > Date:       Fri, 09 Aug 2013 13:46:36 +0200
> >
> >     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part 10
> >
> >     - Always reinitialize/reopen log files if a write operation fails,
> >     timeout of file thread (hanging file system) included.
> >     - Handle synchronization between nodes when log files cannot be created
> >     before a switch over without using any new flag that has to be
> >     checkpointed (remove "files_initialized" flag)
> >     - Incorrect handling of "partial write" is fixed. See #536
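For readers not familiar with the "partial write" problem: write() may return fewer bytes than requested, so the caller has to continue from where it stopped. One common way to handle this is the loop sketched below. The helper name write_all is hypothetical, and this shows the general technique only; it is not necessarily how #536 is fixed in the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all len bytes are
 * written. Returns 0 on success, -1 on error (errno is set by write()).
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		p += n;			/* skip what was already written */
		len -= (size_t)n;
	}
	return 0;
}
```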
> >
> > changeset b8a2060ff5fd6d2685f95757d422addeca1ebdb0
> > Author:     Lennart Lund <lennart.l...@ericsson.com>
> > Date:       Fri, 09 Aug 2013 13:46:36 +0200
> >
> >     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part 11
> >
> >     - Open log files with O_NONBLOCK. Answer client with AIS_ERR_TIMEOUT if
> >     EWOULDBLOCK/EAGAIN (record may be partially written)
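A rough sketch of the non-blocking variant is given below. The functions open_log_nonblock and write_record and the DEMO_* result codes are hypothetical stand-ins (the real code answers the client with SA_AIS_* codes). Treating a short write the same way as EAGAIN is an assumption made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result codes standing in for the SA_AIS_* answers. */
enum demo_rc { DEMO_OK, DEMO_ERR_TIMEOUT, DEMO_ERR_FAILED };

/* Open (or create) a log file for appending, in non-blocking mode. */
static int open_log_nonblock(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_APPEND | O_NONBLOCK, 0644);
}

/* Write one record; map "would block" to a timeout answer. */
static enum demo_rc write_record(int fd, const char *rec, size_t len)
{
	ssize_t n = write(fd, rec, len);

	if (n == (ssize_t)len)
		return DEMO_OK;
	if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return DEMO_ERR_TIMEOUT;	/* nothing was written */
	if (n >= 0)
		return DEMO_ERR_TIMEOUT;	/* assumption: record partially written */
	return DEMO_ERR_FAILED;
}
```

The EAGAIN path is easiest to try out with a full non-blocking pipe, since the Linux open(2) manual page notes that O_NONBLOCK has no effect for regular files and block devices.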
> >
> >
> > Complete diffstat:
> > ------------------
> >   osaf/services/saf/logsv/lgs/lgs_evt.c     |   11 ++--
> >   osaf/services/saf/logsv/lgs/lgs_file.c    |  197 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------
> >   osaf/services/saf/logsv/lgs/lgs_filehdl.c |    2 +-
> >   osaf/services/saf/logsv/lgs/lgs_imm.c     |    1 -
> >   osaf/services/saf/logsv/lgs/lgs_mbcsv.c   |    7 ---
> >   osaf/services/saf/logsv/lgs/lgs_mbcsv.h   |    3 -
> >   osaf/services/saf/logsv/lgs/lgs_stream.c  |  115 +++++++++++++++++++++------------------------------
> >   osaf/services/saf/logsv/lgs/lgs_stream.h  |    1 -
> >   8 files changed, 196 insertions(+), 141 deletions(-)
> >
> >
> > Testing Commands:
> > -----------------
> > See previous review
> >
> >
> > Testing, Expected Results:
> > --------------------------
> >   <<PASTE COMMAND OUTPUTS / TEST RESULTS>>
> >
> >
> > Conditions of Submission:
> > -------------------------
> >   <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>>
> >
> >
> > Arch      Built     Started    Linux distro
> > -------------------------------------------
> > mips        n          n
> > mips64      n          n
> > x86         n          n
> > x86_64      n          n
> > powerpc     n          n
> > powerpc64   n          n
> >
> >
> > Reviewer Checklist:
> > -------------------
> > [Submitters: make sure that your review doesn't trigger any
> > checkmarks!]
> >
> >
> > Your checkin has not passed review because (see checked entries):
> >
> > ___ Your RR template is generally incomplete; it has too many blank entries
> >      that need proper data filled in.
> >
> > ___ You have failed to nominate the proper persons for review and push.
> >
> > ___ Your patches do not have proper short+long header
> >
> > ___ You have grammar/spelling in your header that is unacceptable.
> >
> > ___ You have exceeded a sensible line length in your
> >      headers/comments/text.
> >
> > ___ You have failed to put in a proper Trac Ticket # into your commits.
> >
> > ___ You have incorrectly put/left internal data in your comments/files
> >      (i.e. internal bug tracking tool IDs, product names etc)
> >
> > ___ You have not given any evidence of testing beyond basic build tests.
> >      Demonstrate some level of runtime or other sanity testing.
> >
> > ___ You have ^M present in some of your files. These have to be removed.
> >
> > ___ You have needlessly changed whitespace or added whitespace crimes
> >      like trailing spaces, or spaces before tabs.
> >
> > ___ You have mixed real technical changes with whitespace and other
> >      cosmetic code cleanup changes. These have to be separate commits.
> >
> > ___ You need to refactor your submission into logical chunks; there is
> >      too much content into a single commit.
> >
> > ___ You have extraneous garbage in your review (merge commits etc)
> >
> > ___ You have giant attachments which should never have been sent;
> >      Instead you should place your content in a public tree to be pulled.
> >
> > ___ You have too many commits attached to an e-mail; resend as threaded
> >      commits, or place in a public tree for a pull.
> >
> > ___ You have resent this content multiple times without a clear indication
> >      of what has changed between each re-send.
> >
> > ___ You have failed to adequately and individually address all of the
> >      comments and change requests that were proposed in the initial review.
> >
> > ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
> >
> > ___ Your computer has a badly configured date and time, confusing the
> >      threaded patch review.
> >
> > ___ Your changes affect IPC mechanism, and you don't present any results
> >      for in-service upgradability test.
> >
> > ___ Your changes affect the user manual and documentation, but your patch
> >      series does not contain the patch that updates the Doxygen manual.
> >
> >
> > ------------------------------------------------------------------------------
> > Get 100% visibility into Java/.NET code with AppDynamics Lite!
> > It's a free troubleshooting tool designed for production.
> > Get down to code-level detail for bottlenecks, with <2% overhead.
> > Download for free and get started troubleshooting in minutes.
> > http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Opensaf-devel mailing list
> > Opensaf-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/opensaf-devel
> >
> >
