One more comment:

* saf_logger.c needs to be updated to handle the error code 
SA_AIS_ERR_TIMEOUT

regards,
Anders Widell

2013-08-16 12:16, Lennart Lund skrev:
> Hi,
>
> See my comments to Anders W below
>
> Thanks'
> Lennart
>
>> -----Original Message-----
>> From: Anders Widell [mailto:anders.wid...@ericsson.com]
>> Sent: den 16 augusti 2013 10:47
>> To: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [devel] [PATCH 0 of 3] Review Request for logsv: Fix hanging
>> main thread when file i/o dont return [#9]
>>
>> Hi!
>>
>> I have some comments:
>>
>> * Time-out limits shall be configurable (according to our new policy on
>> limits)
> Already done. See patch, part 12
>
>> * I think the use code for killing and re-starting the slave thread is 
>> overkill and
>> should be removed. Unless we know (and have seen) that this solves a real
>> problem that can happen in practice.
> Ok, I will remove it
>
>> regards,
>> Anders Widell
>>
>> 2013-08-09 15:20, Lennart Lund skrev:
>>> Summary: logsv: Fix hanging main thread when file i/o don't return
>>> Review request for Trac Ticket(s): #9 Peer Reviewer(s): Madhurika
>>> Koppula, (Anders Widell, Hans Feldt) Pull request to: NA Affected
>>> branch(es): devel (4.4) Development branch: <<IF ANY GIVE THE REPO
>>> URL>>
>>>
>>>
>>> --------------------------------
>>> Impacted area       Impact y/n
>>> --------------------------------
>>>    Docs                    n
>>>    Build system            n
>>>    RPM/packaging           n
>>>    Configuration files     n
>>>    Startup scripts         n
>>>    SAF services            y
>>>    OpenSAF services        n
>>>    Core libraries          n
>>>    Samples                 n
>>>    Tests                   n
>>>    Other                   n
>>>
>>>
>>> Comments (indicate scope for each "y" above):
>>> ---------------------------------------------
>>> In order to protect the log server "main thread" (MT) from hanging if
>>> a file operation like write, mkdir etc. does not return, all such
>>> operations are done in a separate "file thread" (FT).
>>> Functions running in the "Main Thread" (MT) that needs file system
>>> operations handle over the execution to the FT when file handling has
>>> to be done. Execution is then given back to the MT again. If a file
>>> operation does not return FT will hang but MT will time out the FT and
>> resume. A timeout will be handled as a file operation fail.
>>> The MT can detect if the FT is hanging and new requests for file operations
>> will be "failed".
>>> Note1: This is an add on to the patches sent out in prevoius review
>> requests.
>>> Note2: The last patch (part 11); The non block handling of log files that 
>>> was
>> suggested by Madhurika
>>>          is contained in its' own patch.
>>>
>>>
>>> changeset 43a3e4173f05a01aa595b3d770a1464a3338f32e
>>> Author:     Lennart Lund <lennart.l...@ericsson.com>
>>> Date:       Fri, 09 Aug 2013 13:46:36 +0200
>>>
>>>     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part
>>> 9
>>>
>>>     - Remove unnecessary data copying in log_file_api() and
>> file_hndl_thread()
>>>     - Return SA_AIS_ERR_TIMEOUT if the write operation time out when
>> a log
>>>     record shall be written. If the file thread is already "hanging" when a
>>>     write is requested no attempt to write is made and
>> SA_AIS_ERR_TRY_AGAIN is
>>>     returned as before.
>>>     - Try to recover file thread by recreating it if it hangs for a long 
>>> time.
>>>     - Recover if bad file descriptor or stale NFS handle.
>>>
>>> changeset d3d78dc3ad87e083e411e0ce5436bcca511d54d6
>>> Author:     Lennart Lund <lennart.l...@ericsson.com>
>>> Date:       Fri, 09 Aug 2013 13:46:36 +0200
>>>
>>>     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part
>>> 10
>>>
>>>     - Always reinitialize/reopen log files if a write operation fails, 
>>> timeout
>>>     of file thread (hanging file system) included.
>>>     - Handle synchronization between nodes when log files cannot be
>> created before
>>>     a switch over without using any new flag that has to be checkpointed
>>>     (remove "files_initialized" flag)
>>>     - Incorrect handling of "partial write" is fixed. See #536
>>>
>>> changeset b8a2060ff5fd6d2685f95757d422addeca1ebdb0
>>> Author:     Lennart Lund <lennart.l...@ericsson.com>
>>> Date:       Fri, 09 Aug 2013 13:46:36 +0200
>>>
>>>     logsv: Fix hanging main thread when file i/o don't return. [#9]. Part
>>> 11
>>>
>>>     - Open log files with O_NONBLOCK. Answer client with
>> AIS_ERR_TIMEOUT if
>>>     EWOULDBLOCK/EAGAIN (record may be parially written)
>>>
>>>
>>> Complete diffstat:
>>> ------------------
>>>    osaf/services/saf/logsv/lgs/lgs_evt.c     |   11 ++--
>>>    osaf/services/saf/logsv/lgs/lgs_file.c    |  197
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> +++++-------------------------
>>>    osaf/services/saf/logsv/lgs/lgs_filehdl.c |    2 +-
>>>    osaf/services/saf/logsv/lgs/lgs_imm.c     |    1 -
>>>    osaf/services/saf/logsv/lgs/lgs_mbcsv.c   |    7 ---
>>>    osaf/services/saf/logsv/lgs/lgs_mbcsv.h   |    3 -
>>>    osaf/services/saf/logsv/lgs/lgs_stream.c  |  115
>> +++++++++++++++++++++------------------------------
>>>    osaf/services/saf/logsv/lgs/lgs_stream.h  |    1 -
>>>    8 files changed, 196 insertions(+), 141 deletions(-)
>>>
>>>
>>> Testing Commands:
>>> -----------------
>>> See previous review
>>>
>>>
>>> Testing, Expected Results:
>>> --------------------------
>>>    <<PASTE COMMAND OUTPUTS / TEST RESULTS>>
>>>
>>>
>>> Conditions of Submission:
>>> -------------------------
>>>    <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>>
>>>
>>>
>>> Arch      Built     Started    Linux distro
>>> -------------------------------------------
>>> mips        n          n
>>> mips64      n          n
>>> x86         n          n
>>> x86_64      n          n
>>> powerpc     n          n
>>> powerpc64   n          n
>>>
>>>
>>> Reviewer Checklist:
>>> -------------------
>>> [Submitters: make sure that your review doesn't trigger any
>>> checkmarks!]
>>>
>>>
>>> Your checkin has not passed review because (see checked entries):
>>>
>>> ___ Your RR template is generally incomplete; it has too many blank entries
>>>       that need proper data filled in.
>>>
>>> ___ You have failed to nominate the proper persons for review and push.
>>>
>>> ___ Your patches do not have proper short+long header
>>>
>>> ___ You have grammar/spelling in your header that is unacceptable.
>>>
>>> ___ You have exceeded a sensible line length in your
>> headers/comments/text.
>>> ___ You have failed to put in a proper Trac Ticket # into your commits.
>>>
>>> ___ You have incorrectly put/left internal data in your comments/files
>>>       (i.e. internal bug tracking tool IDs, product names etc)
>>>
>>> ___ You have not given any evidence of testing beyond basic build tests.
>>>       Demonstrate some level of runtime or other sanity testing.
>>>
>>> ___ You have ^M present in some of your files. These have to be removed.
>>>
>>> ___ You have needlessly changed whitespace or added whitespace crimes
>>>       like trailing spaces, or spaces before tabs.
>>>
>>> ___ You have mixed real technical changes with whitespace and other
>>>       cosmetic code cleanup changes. These have to be separate commits.
>>>
>>> ___ You need to refactor your submission into logical chunks; there is
>>>       too much content into a single commit.
>>>
>>> ___ You have extraneous garbage in your review (merge commits etc)
>>>
>>> ___ You have giant attachments which should never have been sent;
>>>       Instead you should place your content in a public tree to be pulled.
>>>
>>> ___ You have too many commits attached to an e-mail; resend as threaded
>>>       commits, or place in a public tree for a pull.
>>>
>>> ___ You have resent this content multiple times without a clear indication
>>>       of what has changed between each re-send.
>>>
>>> ___ You have failed to adequately and individually address all of the
>>>       comments and change requests that were proposed in the initial review.
>>>
>>> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>>>
>>> ___ Your computer have a badly configured date and time; confusing the
>>>       the threaded patch review.
>>>
>>> ___ Your changes affect IPC mechanism, and you don't present any results
>>>       for in-service upgradability test.
>>>
>>> ___ Your changes affect user manual and documentation, your patch series
>>>       do not contain the patch that updates the Doxygen manual.
>>>
>>>
>>> ----------------------------------------------------------------------
>>> -------- Get 100% visibility into Java/.NET code with AppDynamics
>>> Lite!
>>> It's a free troubleshooting tool designed for production.
>>> Get down to code-level detail for bottlenecks, with <2% overhead.
>>> Download for free and get started troubleshooting in minutes.
>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.c
>>> lktrk _______________________________________________
>>> Opensaf-devel mailing list
>>> Opensaf-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>>>
>>>
>>
>> ------------------------------------------------------------------------------
>> Get 100% visibility into Java/.NET code with AppDynamics Lite!
>> It's a free troubleshooting tool designed for production.
>> Get down to code-level detail for bottlenecks, with <2% overhead.
>> Download for free and get started troubleshooting in minutes.
>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clk
>> trk
>> _______________________________________________
>> Opensaf-devel mailing list
>> Opensaf-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel


------------------------------------------------------------------------------
Get 100% visibility into Java/.NET code with AppDynamics Lite!
It's a free troubleshooting tool designed for production.
Get down to code-level detail for bottlenecks, with <2% overhead. 
Download for free and get started troubleshooting in minutes. 
http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to