[ 
https://issues.apache.org/jira/browse/THRIFT-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Wilson-Brown updated THRIFT-747:
------------------------------------

    Description: 
If a Thrift C++ Client opens a TSocket, then calls fork(), the child process 
can terminate the parent processes' connection by deleting its copy of the 
parent TSocket.

In particular,
TSocket->close() calls shutdown(socket_,SHUT_RDWR) before close(socket_)


Discussion:

This behaviour is inconsistent, as it is:
  * unlike the unix socket close() semantics - close() only affects the process 
that calls it, and the socket is shut down when all copies of it are closed
  * unlike the python and java code, which (appears) to only use close()

This design choice makes it really difficult to program a Thrift client that 
forks other clients in C++, as the first process to call TSocket->close() 
terminates all copies of the connection. The child process is unable to cleanup 
its copy of the parent's connection - this is a particular issue when using 
shared_ptr because the child process can not even exit().

However, the design choice also prevents deadlock/slowdown issues where a 
forked process holds open a copy of the parent's Thrift connections.

The design choice may also be an attempt to implement the 'block to send then 
close' behaviour described in 
http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
However, the shutdown call, and close with the default linger interval of 0 are 
both hard resets.
In the absence of linger/shutdown, the kernel can usually send small thrift 
messages by itself.


Options:

  * The most functional resolution would be to implement 
TSocket->setShutdownOnClose() that allows Thrift users to set their preference 
for shutdown on socket close or delete. However, this change may also need to 
be made to other language libraries.

  * Removing shutdown() from TSocket->close() could break programs that expect 
TSockets not to stay open if children are still running.


TODO:
  * Confirm issue on Linux - see attached test code
  * Decide how to resolve issue
  * Create Patch - see attached TSocket.h & TSocket.cpp from Thrift 0.2.0 (I 
don't know how to generate patches but I'm happy to try and work it out)


  was:
If a Thrift C++ Client opens a TSocket, then calls fork(), the child process 
can terminate the parent processes' connection by deleting its copy of the 
parent TSocket.

In particular,
TSocket->close() calls shutdown(socket_,SHUT_RDWR) before close(socket_)


Discussion:

This behaviour is inconsistent, as it is:
  * unlike the unix socket close() semantics - close() only affects the process 
that calls it, and the socket is shut down when all copies of it are closed
  * unlike the python and java code, which (appears) to only use close()

This design choice makes it really difficult to program a Thrift client that 
forks other clients in C++, as the first process to call TSocket->close() 
terminates all copies of the connection. The child process is unable to cleanup 
its copy of the parent's connection - this is a particular issue when using 
shared_ptr because the child process can not even exit().

However, the design choice also prevents deadlock/slowdown issues where a 
forked process holds open a copy of the parent's Thrift connections.


Options:

  * The most functional resolution would be to implement 
TSocket->setShutdownOnClose() that allows Thrift users to set their preference 
for shutdown on socket close or delete. However, this change may also need to 
be made to other language libraries.

  * Removing shutdown() from TSocket->close() could break programs that expect 
TSockets not to stay open if children are still running.


TODO:
  * Confirm issue on Linux - see attached test code
  * Decide how to resolve issue
  * Create Patch - see attached TSocket.h & TSocket.cpp from Thrift 0.2.0 (I 
don't know how to generate patches but I'm happy to try and work it out)



Added notes about article at 
http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
 describing reliable TCP communication

> C++ TSocket->close calls shutdown breaking forked parent process
> ----------------------------------------------------------------
>
>                 Key: THRIFT-747
>                 URL: https://issues.apache.org/jira/browse/THRIFT-747
>             Project: Thrift
>          Issue Type: Bug
>          Components: Library (C++)
>    Affects Versions: 0.2, 0.3
>         Environment: Cygwin 1.7.1 on Windows XP SP3, Thrift 0.2.0 & r760184 & 
> Trunk
>            Reporter: Tim Wilson-Brown
>            Priority: Minor
>         Attachments: thrift_shutdown_example.cpp, TSocket-020-fix.cpp, 
> TSocket-020-fix.h
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> If a Thrift C++ Client opens a TSocket, then calls fork(), the child process 
> can terminate the parent processes' connection by deleting its copy of the 
> parent TSocket.
> In particular,
> TSocket->close() calls shutdown(socket_,SHUT_RDWR) before close(socket_)
> Discussion:
> This behaviour is inconsistent, as it is:
>   * unlike the unix socket close() semantics - close() only affects the 
> process that calls it, and the socket is shut down when all copies of it are 
> closed
>   * unlike the python and java code, which (appears) to only use close()
> This design choice makes it really difficult to program a Thrift client that 
> forks other clients in C++, as the first process to call TSocket->close() 
> terminates all copies of the connection. The child process is unable to 
> cleanup its copy of the parent's connection - this is a particular issue when 
> using shared_ptr because the child process can not even exit().
> However, the design choice also prevents deadlock/slowdown issues where a 
> forked process holds open a copy of the parent's Thrift connections.
> The design choice may also be an attempt to implement the 'block to send then 
> close' behaviour described in 
> http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
> However, the shutdown call, and close with the default linger interval of 0 
> are both hard resets.
> In the absence of linger/shutdown, the kernel can usually send small thrift 
> messages by itself.
> Options:
>   * The most functional resolution would be to implement 
> TSocket->setShutdownOnClose() that allows Thrift users to set their 
> preference for shutdown on socket close or delete. However, this change may 
> also need to be made to other language libraries.
>   * Removing shutdown() from TSocket->close() could break programs that 
> expect TSockets not to stay open if children are still running.
> TODO:
>   * Confirm issue on Linux - see attached test code
>   * Decide how to resolve issue
>   * Create Patch - see attached TSocket.h & TSocket.cpp from Thrift 0.2.0 (I 
> don't know how to generate patches but I'm happy to try and work it out)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to