[issue11812] transient socket failure to connect to 'localhost'

2011-12-30 Thread Charles-François Natali
Charles-François Natali added the comment: Seems to be fixed now. -- resolution: -> fixed stage: needs patch -> committed/rejected status: open -> closed

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset 76b6b85e4b78 by Jesus Cea in branch '2.7': Solved a potential deadlock in test_telnetlib.py. Related to issue #11812 http://hg.python.org/cpython/rev/76b6b85e4b78 New changeset 554802e562fa by Jesus Cea in branch '2.7': Partial patch for issue #118

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Stupid mistake. Please, review b93657b239a5.diff (erroneous "sock.close()" > deleted) Looks good to me, thanks.

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Stupid mistake. Please, review b93657b239a5.diff (erroneous "sock.close()" deleted)

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: Removed file: http://bugs.python.org/file23631/71ab454bfe19.diff

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: Added file: http://bugs.python.org/file23632/b93657b239a5.diff

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Please, review 71ab454bfe19.diff. I am not satisfied with the timeout approach, since the timeout value is arbitrary. I would rather do the fake connection in teardown, to be sure the server died. Anyway, this seems to be the minimal patch to solve the prob

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: Added file: http://bugs.python.org/file23631/71ab454bfe19.diff

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Consider too that if something goes bad enough in the test to skip the > teardown method, Such as? tearDown is normally like a "finally" block, it always gets executed (unless perhaps setUp fails).

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Consider too that if something goes bad enough in the test to skip the teardown method, the thread will be alive for a while, possibly contaminating some other tests, as you commented. This is actually unsolvable, I think. Code that NEEDS to be executed wit

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Antoine: Then you would be satisfied if I increase the timeout from 3 > seconds to 60 seconds and clean the event signaling? Yes!

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Antoine: Then you would be satisfied if I increase the timeout from 3 seconds to 60 seconds and clean up the event signaling? The current event signaling code has a few race conditions with potential deadlocks.

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Antoine, the problem with this test is the timeout. We can set an > arbitrary timeout, but how big is big enough? I would say answering this question is your task, since you have access to that buildbot. > The only "cosmetic" problem is the risk of "leaking

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Antoine, the problem with this test is the timeout. We can set an arbitrary timeout, but how big is big enough? My change doesn't need a timeout at all. Problem solved. The only "cosmetic" problem is the risk of "leaking" a thread. But it would not affect

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > If thread.join had a timeout, we could wait for a while and if the > thread is still active, do a fake connection and another join. What's wrong with a socket timeout exactly? Everything you're proposing is ten times more complicated, and more fragile.

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Please, review attached changeset. Doesn't look acceptable to me.

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Michael Foord
Changes by Michael Foord: -- nosy: -michael.foord

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Antoine: Deleting the socket timeout doesn't hang the test if we set the thread to "daemon" and do not do a thread.join() (unneeded in the normal situation, since garbage collecting the test instance will collect the thread too). If you don't like this, I c

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Changes by Jesús Cea Avión: Added file: http://bugs.python.org/file23630/2b155a6d25bb.diff

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Uh, doing a fake connection in the teardown would be problematic if the socket is reused for something else in the meantime. The kernel is supposed to keep the socket in the "not reuse" state for a while, but... I am seeing too liberal mixing of suppor

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Anyway, we can keep using "localhost", but just delete the socket > timeout in the server. Please don't. Any problem might then hang the whole test suite. You can bump it up if you want, though. > About using getsockname(), the bind would bind to all IPs of

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Charles-François: The only way for the server thread to still be around would be if the test fails badly and teardown is not called (I would do a fake TCP connection to the server in the teardown, followed by a thread.join). In this case, the thread (being "daemon") w
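
The "fake connection followed by a join" idea can be sketched as below. This is an illustration in Python 3 only, not the patch that was attached to the issue; `start_blocking_server` and `shut_down` are hypothetical names, and the bounded `join(10)` is an extra safety net added for the example so the teardown itself cannot hang.

```python
import socket
import threading


def start_blocking_server():
    """Start a daemon thread that waits, without any timeout, for one client."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    address = server.getsockname()

    def serve():
        conn, _ = server.accept()  # blocks until somebody connects
        conn.close()
        server.close()

    # daemon=True: a leaked thread does not keep the interpreter alive.
    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return address, thread


def shut_down(address, thread):
    """Wake the server with a throwaway connection, then wait for the thread."""
    try:
        socket.create_connection(address, timeout=5).close()
    except OSError:
        pass  # the server already accepted a client and closed its socket
    thread.join(10)  # bounded wait, so this helper cannot block forever
    return not thread.is_alive()
```

A test's tearDown could call `shut_down(address, thread)` with the values returned by `start_blocking_server()`. The objection raised elsewhere in the thread still applies: if the port were reused by something else in the meantime, the throwaway connection would reach the wrong listener.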

[issue11812] transient socket failure to connect to 'localhost'

2011-11-08 Thread Charles-François Natali
Charles-François Natali added the comment: > The server thread only waits for 3 seconds for the connection. If a > connection is not created before 3 seconds, the server suicides and when the > connection is tried, it will fail. This probably explains why the problem is > sporadic and seems to

[issue11812] transient socket failure to connect to 'localhost'

2011-11-07 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: About the "support.HOST", changing from "localhost" to "127.0.0.1" could be problematic on servers without IPv4 support (IPv6-only servers). I guess this is a theoretical problem so far, and that when we find this issue the exception would be pretty obvious...

[issue11812] transient socket failure to connect to 'localhost'

2011-11-07 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Checking the testsuite source code, I see several issues: The server thread only waits for 3 seconds for the connection. If a connection is not created before 3 seconds, the server suicides and when the connection is tried, it will fail. This probably explai
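
The race described above, where the server gives up before a slow client connects, can be reproduced in isolation. The following is a rough Python 3 sketch, not the test suite's code; the function name and the 0.2 second timeout are made up, and the code joins the server thread first as a stand-in for a client that arrives too late.

```python
import socket
import threading


def late_client_is_refused(server_timeout=0.2):
    """Return True if a client arriving after the server timed out cannot connect."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server.settimeout(server_timeout)  # plays the role of the 3 second timeout
    address = server.getsockname()

    def serve():
        try:
            conn, _ = server.accept()
            conn.close()
        except socket.timeout:
            pass  # nobody connected in time, so the server gives up
        finally:
            server.close()

    thread = threading.Thread(target=serve)
    thread.start()
    thread.join()  # simulate a slow client: by now the server is gone

    try:
        socket.create_connection(address, timeout=5).close()
    except OSError:
        return True  # typically "connection refused"
    return False
```

On a loaded buildbot the same thing can happen by accident whenever the client side is scheduled later than the server's timeout, which would fit the sporadic failures reported here.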

[issue11812] transient socket failure to connect to 'localhost'

2011-11-03 Thread Charles-François Natali
Charles-François Natali added the comment: > I explain a reliable method to reproduce this issue on Linux It's a way to reproduce the symptom (i.e. connection refused because you're trying to connect to 127.0.0.2 while the server is listening on 127.0.0.1), but not the cause: if the server bi

[issue11812] transient socket failure to connect to 'localhost'

2011-11-03 Thread STINNER Victor
STINNER Victor added the comment: > Sure, if you have access to a machine on which you can > reliably reproduce the problem, it'll be much easier. I explained a reliable method to reproduce this issue on Linux (it may work on other OSes) in msg138882.

[issue11812] transient socket failure to connect to 'localhost'

2011-10-31 Thread Charles-François Natali
Charles-François Natali added the comment: > Is anybody actively working on this? I don't think so. > Should I get involved? Sure, if you have access to a machine on which you can reliably reproduce the problem, it'll be much easier. I would bet on a deficient name resolution service: usi

[issue11812] transient socket failure to connect to 'localhost'

2011-10-31 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: Any progress on this? I still see frequent OpenIndiana buildbot failures because of this. Is anybody actively working on this? Should I get involved?

[issue11812] transient socket failure to connect to 'localhost'

2011-10-08 Thread Charles-François Natali
Charles-François Natali added the comment: > Attached patch reads the name of the server socket instead of using > HOST or 'localhost'. > By the way, why do we use 'localhost' instead of '127.0.0.1' for > support.HOST? '127.0.0.1' doesn't depend on the DNS configuration of > the host (especiall

[issue11812] transient socket failure to connect to 'localhost'

2011-09-09 Thread Jesús Cea Avión
Jesús Cea Avión added the comment: I am seeing this failure from time to time in OpenIndiana buildbots. For instance http://www.python.org/dev/buildbot/all/builders/AMD64%20OpenIndiana%203.x/builds/1751/steps/test/logs/stdio Seems a clear race condition. -- nosy: +jcea

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: Perhaps Michael or Ezio have an idea of whether 'reason' or 'happenstance' is the answer to your questions. -- nosy: +ezio.melotti, michael.foord

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread STINNER Victor
STINNER Victor added the comment: Some tests in test_ftplib and test_telnetlib use HOST, or 'localhost' directly, instead of getting the host from the server socket. As for the test_ftplib failures, only the tests that explicitly use 'localhost' fail. The attached patch reads the name of the server
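
The general idea of asking the server socket for its own address, rather than resolving a name such as 'localhost' on the client side, might look like the sketch below. It is written for Python 3 and is not the attached patch; `start_greeting_server` and `fetch_greeting` are hypothetical helpers, and the 60 second timeouts simply follow the value discussed in this thread.

```python
import socket
import threading


def start_greeting_server():
    """Listen on a loopback port chosen by the OS and greet a single client."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server.settimeout(60)
    # Ask the socket where it is really bound; no name resolution is involved.
    address = server.getsockname()

    def serve():
        try:
            conn, _ = server.accept()
        except socket.timeout:
            return
        finally:
            server.close()
        with conn:
            conn.sendall(b"hello")

    thread = threading.Thread(target=serve)
    thread.start()
    return address, thread


def fetch_greeting():
    """Connect to the exact (host, port) pair the server reported."""
    address, thread = start_greeting_server()
    chunks = []
    with socket.create_connection(address, timeout=60) as client:
        while True:
            chunk = client.recv(16)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    thread.join()
    return b"".join(chunks)
```

Because the client connects to the address returned by `getsockname()`, a resolver that maps 'localhost' to a different loopback address than the one the server bound to cannot cause a spurious "connection refused".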

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread STINNER Victor
STINNER Victor added the comment: > I only saw the failure on test_telnetlib, not in other tests > using sockets. Oh, the last failure of the buildbot "x86 Windows7 3.x" is on test_ftplib, not test_telnetlib! == ERROR: testTi

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread STINNER Victor
STINNER Victor added the comment: Does the failure occur on other buildbots? If not, it's maybe something specific to this Windows Seven: a local firewall or something like that? Can we use 127.0.0.1 instead of "localhost"? I don't know if it would change anything. Note: the TCP server

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread STINNER Victor
STINNER Victor added the comment: > With a bit of searching, HOST == support.HOST == 'localhost'. > Looking at the traceback, it is socket that fails, not telnetlib > or its test. I only saw the failure on test_telnetlib, not in other tests using sockets. I think that this issue is specific to

[issue11812] transient socket failure to connect to 'localhost'

2011-06-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: With a bit of searching, HOST == support.HOST == 'localhost'. Looking at the traceback, it is socket that fails, not telnetlib or its test. Hence the clearer title. I am still curious what you propose: catch and skip or something else? For Windows, I consid