On Tue, Jan 27, 2009 at 10:29:19AM +0100, Daniel Schreiber wrote:
> Iustin Pop schrieb:
>> On Mon, Jan 26, 2009 at 02:28:51PM +0100, Daniel Schreiber wrote:
>>>
>>> It blocks forever after printing the twisted version.
>>
>> This is strange then. The breakage occurs even before printing the first
>> result, which means it does not work at all (so it's different from the bug
>> that we fixed in 1.2.5).
>>
>> Just to be sure, can you confirm this also happens when trying against
>> localhost? And do you get anything in /var/log/ganeti/node-daemon.log?
>
> Traceback (most recent call last):
>   File "/usr/sbin/ganeti-noded", line 635, in <module>
>     main()
>   File "/usr/sbin/ganeti-noded", line 631, in main
>     reactor.run()
>   File "/usr/lib/python2.5/site-packages/twisted/internet/base.py", line 1048, in run
>     self.mainLoop()
> --- <exception caught here> ---
>   File "/usr/lib/python2.5/site-packages/twisted/internet/base.py", line 1060, in mainLoop
>     self.doIteration(t)
>   File "/usr/lib/python2.5/site-packages/twisted/internet/selectreactor.py", line 126, in doSelect
>     self._preenDescriptors()
>   File "/usr/lib/python2.5/site-packages/twisted/internet/selectreactor.py", line 88, in _preenDescriptors
>     self._disconnectSelectable(selectable, e, False)
>   File "/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py", line 196, in _disconnectSelectable
>     selectable.connectionLost(failure.Failure(why))
>   File "/usr/lib/python2.5/site-packages/twisted/internet/posixbase.py", line 150, in connectionLost
>     os.close(fd)
> exceptions.OSError: [Errno 9] Bad file descriptor
>
> This happens right after startup of the node-daemon. Same trace on all  
> nodes. Nothing else is logged after that.
>
>> I would be interested at this point in:
>>   - the output of "python -v call_version.py ...."
>>   - an strace of "python call_version.py ..."
>>   - an strace of ganeti-noded while doing the above
>
> Attached.

Thanks. This is starting to look like a different problem.

We have seen the errno 9 before, but that should have been fixed. The
strace shows that ganeti-noded is not actually listening, and that
call_version talks to a different host: it completes a connect(), while
the node daemon doesn't get any traffic.
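
A quick way to cross-check the "not actually listening" part without
strace is a plain TCP connect against the node. The following is only a
rough sketch and not part of Ganeti; the helper name is_listening is
made up for illustration, and the port has to be whatever your
ganeti-noded is configured to use.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connect() to (host, port) succeeds.

    Hypothetical helper for illustration only; it is not Ganeti code.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False
    conn.close()
    return True
```

Note that a successful connect only shows that something accepts
connections at that address. As the strace suggests, that is not
necessarily the node daemon you expect, so it is worth running the check
against both localhost and the node's hostname.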

I'll try to understand why you get the errno 9 error. In the meantime,
the output of "lsof -p $pid_of_node_daemon" and an strace of the node
daemon startup would help to understand what state the node daemon is
in.
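
For reference, errno 9 (EBADF) is what os.close() raises when it is
handed a descriptor that is no longer open, for example one that was
already closed elsewhere. The sketch below only reproduces that generic
behaviour; it is an assumption that something similar happens inside the
daemon, and close_twice is a made-up name.

```python
import errno
import os


def close_twice():
    """Close one descriptor twice and return the errno of the second close.

    Illustration only: shows how EBADF arises, not how ganeti-noded
    or Twisted actually reach that state.
    """
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    os.close(read_fd)
    try:
        # The descriptor is already closed, so this call must fail.
        os.close(read_fd)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    print(close_twice() == errno.EBADF)
```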

iustin


