On 2010-02-24 20:50, Brandon S. Allbery KF8NH wrote:
tcpdump 'host ps3 and tcp[tcpflags] & 0x27 != 0'
(Indulging in some serious thread necromancy here, but...)
Alright, I've _finally_ got round to doing a dump with leaking file
descriptors (or threads as the case may be).
The bits of lsof
On Mar 25, 2010, at 15:03 , Bardur Arantsson wrote:
On 2010-02-24 20:50, Brandon S. Allbery KF8NH wrote:
tcpdump 'host ps3 and tcp[tcpflags] & 0x27 != 0'
The only striking thing I can see about the dump is that there are
22 (conspicuously close to 16) sequences like:
19:45:30.135291 IP
On 2010-02-24 05:10, Brandon S. Allbery KF8NH wrote:
On Feb 21, 2010, at 20:17 , Jeremy Shaw wrote:
The PS3 does do something though. If we were doing a write *and* read
select on the socket, the read select would wake up. So, it is trying
to notify us that something has happened, but we are not
On Feb 21, 2010, at 20:17 , Jeremy Shaw wrote:
The PS3 does do something though. If we were doing a write *and*
read select on the socket, the read select would wake up. So, it is
trying to notify us that something has happened, but we are not
seeing it because we are only looking at the
Quoth Brandon S. Allbery KF8NH allb...@ece.cmu.edu,
On Feb 21, 2010, at 20:17 , Jeremy Shaw wrote:
The PS3 does do something though. If we were doing a write *and*
read select on the socket, the read select would wake up. So, it is
trying to notify us that something has happened, but we are
On Feb 23, 2010, at 23:47 , Donn Cave wrote:
My prediction is that on the contrary, the transition between functional
and defunct will not be announced in any way by the peer, but that's
just guessing. It would be a lot less interesting.
But that's not the issue. The *kernel* is
Jeremy Shaw wrote:
Hello,
I think to make progress on this bug we really need a failing test case that
other people can reproduce.
I have hacked up a small server that should reproduce the error (using fdWrite
instead of sendfile). And a small C client which is intended to reproduce
the error --
Taru Karttunen wrote:
Excerpts from Bardur Arantsson's message of Wed Feb 17 21:27:07 +0200 2010:
For sendfile, a timeout of 1 second would probably be fine. The *ONLY*
purpose of threadWaitWrite in the sendfile code is to avoid busy-waiting
on EAGAIN from the native sendfile.
Of course this
Quoth Bardur Arantsson s...@scientician.net,
Taru Karttunen wrote:
Excerpts from Bardur Arantsson's message of Wed Feb 17 21:27:07 +0200 2010:
For sendfile, a timeout of 1 second would probably be fine. The *ONLY*
purpose of threadWaitWrite in the sendfile code is to avoid busy-waiting
on
On Feb 21, 2010, at 11:50 AM, Donn Cave wrote:
The problem is that this definition of `closed' is, precisely,
`failed to respond within 2 seconds.' If there is no observable
difference between a connection that has been abandoned by the PS3,
and a connection that just suffered a momentary
On Sun, Feb 21, 2010 at 6:39 PM, Donn Cave d...@avvanta.com wrote:
Quoth Jeremy Shaw jer...@n-heptane.com,
...
What happens is the PS3 has closed the connection, and if you attempt
to send any more packets the PS3 will tell you it has closed the
connection and the write() / sendfile()
Excerpts from Bardur Arantsson's message of Wed Feb 17 21:27:07 +0200 2010:
For sendfile, a timeout of 1 second would probably be fine. The *ONLY*
purpose of threadWaitWrite in the sendfile code is to avoid busy-waiting
on EAGAIN from the native sendfile.
Of course this will kill connections
Excerpts from Bardur Arantsson's message of Tue Feb 16 23:48:14 +0200 2010:
This cannot be fixed in the sendfile library, it is a
feature of TCP that connections may linger for a long
time unless explicit timeouts are used.
The problem is that the sendfile library *doesn't* wake
up when
On Wed, Feb 17, 2010 at 2:36 AM, Taru Karttunen tar...@taruti.net wrote:
Excerpts from Bardur Arantsson's message of Tue Feb 16 23:48:14 +0200 2010:
This cannot be fixed in the sendfile library, it is a
feature of TCP that connections may linger for a long
time unless explicit timeouts
Jeremy Shaw wrote:
On Wed, Feb 17, 2010 at 2:36 AM, Taru Karttunen tar...@taruti.net wrote:
So for sendfile, instead of threadWaitWrite we could do:
r <- timeout (60 * 10^6) threadWaitWrite
case r of
  Nothing -> ... -- timed out
  (Just ()) -> ... -- keep going
For sendfile, a timeout of
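The timeout idea quoted above can be sketched as a small standalone program. This is only an illustration, not the sendfile library's code: `waitWritableWithin` is a hypothetical helper name, the one-second limit is an arbitrary choice, and a pipe from the `unix` package stands in for the client socket.

```haskell
import Control.Concurrent (threadWaitWrite)
import System.Posix.IO (closeFd, createPipe)
import System.Posix.Types (Fd)
import System.Timeout (timeout)

-- | Block until 'fd' is writable, but give up after 'micros' microseconds.
-- Returns False on timeout so the caller can decide to drop the connection.
-- (waitWritableWithin is a hypothetical helper, not part of any library.)
waitWritableWithin :: Int -> Fd -> IO Bool
waitWritableWithin micros fd = do
  r <- timeout micros (threadWaitWrite fd)
  return (maybe False (const True) r)

main :: IO ()
main = do
  -- A pipe stands in for the client socket in this sketch.
  (readEnd, writeEnd) <- createPipe
  ok <- waitWritableWithin 1000000 writeEnd
  putStrLn (if ok then "writable" else "timed out")
  closeFd readEnd
  closeFd writeEnd
```

A timeout like this bounds how long a stuck connection can hold its file descriptor, at the cost of also dropping clients that are merely slow, which is the trade-off debated in the thread.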
On Wed, Feb 17, 2010 at 1:27 PM, Bardur Arantsson s...@scientician.net wrote:
(Obviously, if people are using sendfile with something other than
happstack,
it does not help them, but it sounds like trying to fix things in
sendfile is misguided anyway.)
How so? As a user I expect
On Wed, Feb 17, 2010 at 3:54 PM, Jeremy Shaw jer...@n-heptane.com wrote:
On Wed, Feb 17, 2010 at 1:27 PM, Bardur Arantsson s...@scientician.net wrote:
(Obviously, if people are using sendfile with something other than
happstack,
it does not help them, but it sounds like trying to fix
On Sun, Feb 14, 2010 at 2:04 PM, Bardur Arantsson s...@scientician.net wrote:
I've tested this extensively during this weekend and not a single leaked
FD so far.
I think we can safely say that polling an FD for read readiness is
sufficient to properly detect a disconnected client regardless
On Tue, Feb 16, 2010 at 12:37 PM, Jeremy Shaw jer...@n-heptane.com wrote:
I think this goes beyond just a sendfile issue -- anyone trying to write
non-blocking network code should run into this issue, right ?
What's a fairly concise description of the issue at hand? I haven't been
paying
Jeremy Shaw wrote:
On Sun, Feb 14, 2010 at 2:04 PM, Bardur Arantsson s...@scientician.net wrote:
Not sure what the best solution for this would be, API-wise... Maybe
actually have sendfile read the data and supply it to a user-defined
function which could react to the data in some way? (Could
Bardur Arantsson wrote:
Jeremy Shaw wrote:
[--snip--]
Re: a test case, you'll probably need to run the test case code on a
client whose OS allows (from userspace) the sudden dropping of
connections without sending a proper connection shutdown sequence. I'm
not sure what that OS would be.
Excerpts from Bardur Arantsson's message of Tue Feb 16 22:57:23 +0200 2010:
As far as I can tell, all nonblocking networking code is vulnerable to
this issue (unless it actually does use threadWaitRead, obviously :)).
There are a few easy fixes:
1) socket timeouts with
Taru Karttunen wrote:
Excerpts from Bardur Arantsson's message of Tue Feb 16 22:57:23 +0200 2010:
As far as I can tell, all nonblocking networking code is vulnerable to
this issue (unless it actually does use threadWaitRead, obviously :)).
There are a few easy fixes:
1) socket timeouts with
On Tue, Feb 16, 2010 at 3:48 PM, Bardur Arantsson s...@scientician.net wrote:
The problem is that the sendfile library *doesn't* wake
up when the connection is terminated (because of threadWaitWrite)
-- it doesn't matter what the timeout is.
Have we actually confirmed this? We know that with
Jeremy Shaw wrote:
import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types
data RW = Read | Write
threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     rid <- forkIO $ threadWaitRead fd >> putMVar m Read
     wid <- forkIO $
Jeremy Shaw wrote:
import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types
data RW = Read | Write
threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     rid <- forkIO $ threadWaitRead fd >> putMVar m Read
     wid <- forkIO $
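Since the snippet above is cut off, here is one way the "wait for read or write, whichever comes first" idea might be completed. It is a sketch under assumptions, not Jeremy's actual code: the name `waitReadOrWrite`, the `finally` cleanup, and the pipe used in `main` are all illustrative choices.

```haskell
import Control.Concurrent
import Control.Exception (finally)
import System.Posix.IO (closeFd, createPipe)
import System.Posix.Types (Fd)

data RW = Read | Write deriving (Eq, Show)

-- | Wait until 'fd' is readable or writable and report which came first.
-- (waitReadOrWrite is a hypothetical name used only for this sketch.)
waitReadOrWrite :: Fd -> IO RW
waitReadOrWrite fd = do
  m   <- newEmptyMVar
  -- One helper thread per direction; whichever wakes first fills the MVar.
  rid <- forkIO $ threadWaitRead  fd >> putMVar m Read
  wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
  -- Always kill both helpers, including the one still blocked.
  takeMVar m `finally` (killThread rid >> killThread wid)

main :: IO ()
main = do
  -- A pipe stands in for the client socket in this sketch.
  (readEnd, writeEnd) <- createPipe
  r <- waitReadOrWrite writeEnd
  print r
  closeFd readEnd
  closeFd writeEnd
```

In the sendfile setting, a `Read` result on the output socket would be the hint that the peer has sent something (or hung up) while the server was only waiting to write.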
On Wed, Feb 10, 2010 at 1:15 PM, Bardur Arantsson s...@scientician.net wrote:
I've also been contemplating some solutions, but I cannot see any solutions
to this problem which could reasonably be implemented outside of GHC itself.
GHC lacks a threadWaitError, so there's no way to detect the
Jeremy Shaw wrote:
On Wed, Feb 10, 2010 at 1:15 PM, Bardur Arantsson s...@scientician.net wrote:
I've also been contemplating some solutions, but I cannot see any solutions
to this problem which could reasonably be implemented outside of GHC itself.
GHC lacks a threadWaitError, so there's no
Bardur Arantsson s...@scientician.net wrote:
...
then do errno <- getErrno
        if errno == eAGAIN
          then do
            threadDelay 100
            sendfile out_fd in_fd poff bytes
          else throwErrno "Network.Socket.SendFile.Linux"
On Feb 11, 2010, at 1:57 PM, Bardur Arantsson wrote:
2. the remote client has terminated the connection as far as it is
concerned but not notified the server -- when you try to send data it will
reject it, and send/write/sendfile/etc will raise sigPIPE.
Looking at your debug output, we are
Thomas DuBuisson wrote:
Bardur Arantsson s...@scientician.net wrote:
...
then do errno <- getErrno
        if errno == eAGAIN
          then do
            threadDelay 100
            sendfile out_fd in_fd poff bytes
          else throwErrno
Jeremy Shaw wrote:
On Feb 11, 2010, at 1:57 PM, Bardur Arantsson wrote:
[--snip lots of technical info--]
Thanks for digging so much into this.
Just a couple of comments:
The whole point of the sendfile library is to use sendfile(), so not
using sendfile() seems like the wrong
On Feb 9, 2010, at 6:47 PM, Thomas Hartman wrote:
Matt, have you seen this thread?
Jeremy, are you saying this is a bug in the sendfile library on hackage,
or something underlying?
I'm saying that the behavior of the sendfile library is buggy. But it
could be due to something underlying..
Jeremy Shaw wrote:
On Feb 9, 2010, at 6:47 PM, Thomas Hartman wrote:
Matt, have you seen this thread?
Jeremy, are you saying this is a bug in the sendfile library on hackage,
or something underlying?
I'm saying that the behavior of the sendfile library is buggy. But it
could be due to
On Sun, Feb 7, 2010 at 9:22 AM, Bardur Arantsson s...@scientician.net wrote:
True, it is perhaps technically not a bug, but it is certainly a misfeature
since there is no easy way (at least AFAICT) to discover that something bad
has happened for the file descriptor and act accordingly. AFAICT
Matt, have you seen this thread?
Jeremy, are you saying this is a bug in the sendfile library on hackage,
or something underlying?
thomas.
2010/2/9 Jeremy Shaw jer...@n-heptane.com:
On Sun, Feb 7, 2010 at 9:22 AM, Bardur Arantsson s...@scientician.net
wrote:
True, it is perhaps technically not
Bardur Arantsson wrote:
Bardur Arantsson wrote:
(sorry about replying-to-self)
During yet another bout of debugging, I've added even more "I am here"
instrumentation code to the SendFile code, and the culprit seems to be
threadWaitWrite.
As Jeremy Shaw pointed out off-list, the symptoms
It's not clear to me that this is actually a bug in threadWaitWrite.
I believe that under Linux, select() does not wake up just because the file
descriptor was closed. (Under Windows, and possibly Solaris/BSD/etc it
does). So this behavior might be consistent with normal Linux behavior.
However,
Jeremy Shaw wrote:
It's not clear to me that this is actually a bug in threadWaitWrite.
I believe that under Linux, select() does not wake up just because the file
descriptor was closed.
select() has the option of specifying an exceptfds FD_SET where I'd
certainly _expect_ select() to flag an
Brandon S. Allbery KF8NH wrote:
On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
[--snip--]
Broken pipe is normally handled as a signal, and is only mapped to an
error if SIGPIPE is set to SIG_IGN. I can well imagine that the SIGPIPE
signal handler isn't closing resources properly; a
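The signal-versus-error distinction described above can be observed from Haskell with a short experiment. This is a sketch, assuming the `unix` package; `writeToClosedPipe` is a hypothetical name, and a pipe with its read end closed stands in for a socket whose peer has gone away.

```haskell
import Control.Exception (IOException, try)
import GHC.IO.Exception (IOErrorType (ResourceVanished))
import System.IO.Error (ioeGetErrorType)
import System.Posix.IO (closeFd, createPipe, fdWrite)
import System.Posix.Signals (Handler (Ignore), installHandler, sigPIPE)
import System.Posix.Types (ByteCount)

-- | Write to a pipe whose read end is already closed, with SIGPIPE ignored.
-- Returns True if the write failed with a "resource vanished" IOError.
-- (writeToClosedPipe is a hypothetical name used only for this sketch.)
writeToClosedPipe :: IO Bool
writeToClosedPipe = do
  -- With SIGPIPE ignored, the failed write surfaces as EPIPE instead of
  -- terminating the process.
  _ <- installHandler sigPIPE Ignore Nothing
  (readEnd, writeEnd) <- createPipe
  closeFd readEnd
  r <- try (fdWrite writeEnd "x") :: IO (Either IOException ByteCount)
  closeFd writeEnd
  return $ case r of
    Left e  -> ioeGetErrorType e == ResourceVanished
    Right _ -> False

main :: IO ()
main = do
  vanished <- writeToClosedPipe
  putStrLn (if vanished then "write failed: resource vanished" else "write did not fail as expected")
```

Catching the exception at the write site, as `try` does here, gives the caller a place to close the file handle and socket explicitly rather than relying on a signal handler to clean up.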
On Sat, Feb 06, 2010 at 09:16:35AM +0100, Bardur Arantsson wrote:
Brandon S. Allbery KF8NH wrote:
On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
[--snip--]
Broken pipe is normally handled as a signal, and is only mapped
to an error if SIGPIPE is set to SIG_IGN. I can well imagine that
Felipe Lessa wrote:
On Sat, Feb 06, 2010 at 09:16:35AM +0100, Bardur Arantsson wrote:
Brandon S. Allbery KF8NH wrote:
On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
[--snip--]
Broken pipe is normally handled as a signal, and is only mapped
to an error if SIGPIPE is set to SIG_IGN. I can
Bardur Arantsson wrote:
(sorry about replying-to-self)
During yet another bout of debugging, I've added even more "I am here"
instrumentation code to the SendFile code, and the culprit seems to be
threadWaitWrite.
I think I've pretty much confirmed this.
I've changed the code again. This
me too.
2010/2/5 MightyByte mightyb...@gmail.com:
I've been seeing a steady stream of similar "resource vanished" messages
for as long as I've been running my happstack app. The message I get
is this:
socket: 58: hClose: resource vanished (Broken pipe)
I run my app from a shell script
Jeremy Shaw wrote:
Actually,
We should start by testing if native sendfile leaks file descriptors even
when the whole file is sent. We have a test suite, but I am not sure if it
tests for file handle leaking...
I should have posted this earlier, but the exact message I'm seeing in
the case
I've been seeing a steady stream of similar "resource vanished" messages
for as long as I've been running my happstack app. The message I get
is this:
socket: 58: hClose: resource vanished (Broken pipe)
I run my app from a shell script inside a while true loop, so it
automatically gets restarted
Thomas Hartman wrote:
Do you have a test script to reproduce the behavior?
Unfortunately not, but the behavior *is* 100% reproducible with
my PS3 client. The production of a leaked FD appears to require a
particularly abrupt disconnect (see my other reply in this thread), so
you're probably
In desperation, I've tried to instrument a couple of the functions in
SendFile:
sendFile'' :: Socket -> Handle -> Integer -> Integer -> IO ()
sendFile'' outs inp off count =
  do let out_fd = Fd (fdSocket outs)
     in_fd <- handleToFd inp
     putStrLn ("in_fd=" ++ show in_fd)
On Feb 5, 2010, at 02:56 , Bardur Arantsson wrote:
I should have posted this earlier, but the exact message I'm seeing
in the case where the Bad Client disconnects is this:
hums: Network.Socket.SendFile.Linux: resource vanished (Broken pipe)
Oddly, I haven't been able to reproduce this