Bug#596291: offlineimap getting stuck when run with multiple threads/connections

2010-09-13 Thread Stephen Gran
Hi,

A couple of final notes.  I get this hang even when running with -1, and
I only get this hang when on the same LAN as the IMAP server.  When
fetching remotely, it's fine.  Again, this points to a timing dependant
bug that is only tickled when the sync isn't being slowed down by
latency or something else.

Sorry I'm not being more helpful.

Cheers,
-- 
 -
|   ,''`.Stephen Gran |
|  : :' :sg...@debian.org |
|  `. `'Debian user, admin, and developer |
|`- http://www.debian.org |
 -


signature.asc
Description: Digital signature


Bug#596291: offlineimap getting stuck when run with multiple threads/connections

2010-09-13 Thread Stephen Gran
This one time, at band camp, Stephen Gran said:
 I have so far been unable to reproduce this when running offlineimap
 under strace, which suggests a timing dependant bug (kind of as
 suspected, but still not all that helpful, I realize).  The earlier
 strace (and the strace I see when attaching after it hangs) does suggest
 that a lock is not being releaesed somewhere, but I can't immediately
 see anything obviously wrong with the logic in the code.

Trying to run with -d thread under python2.6 gets me a recursion
exceeded exception, but I did manage to capture a lockup under 2.5:

Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c925d3210.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: InstanceLimitedThread(Copy message 2200 
from packaging.clamav, initial daemon).start(): starting thread
Copy message 2200 from packaging.clamav: InstanceLimitedThread(Copy message 
2200 from packaging.clamav, started daemon).__bootstrap(): thread started
Copy message 2200 from packaging.clamav:
   Copy message 2200 IMAP[packaging.clamav] - Maildir[packaging.clamav], 
LocalStatus[packaging.clamav]
Copy message 2200 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire(1): blocked waiting, value=0
Copy message 2200 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c91080e10.release: success, value=1
Copy message 2200 from packaging.clamav: Condition(thread.lock object at 
0x7f0c9d86f288, 1).notify(): notifying 1 waiter
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f288, 0).wait(): got it
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.release: success, value=1
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f288, 0).notify(): no waiters
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c925d3210.acquire(1): blocked waiting, value=0
Copy message 2200 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c925d3210.release: success, value=1
Copy message 2200 from packaging.clamav: Condition(thread.lock object at 
0x7f0c9d86f1e0, 1).notify(): notifying 1 waiter
Copy message 2200 from packaging.clamav: InstanceLimitedThread(Copy message 
2200 from packaging.clamav, started daemon).__bootstrap(): normal return
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f1e0, 0).wait(): got it
Copy message 2200 from packaging.clamav: Condition(thread.lock object at 
0x7a81b40, 0).notify(): no waiters
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c925d3210.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: InstanceLimitedThread(Copy message 2201 
from packaging.clamav, initial daemon).start(): starting thread
Copy message 2201 from packaging.clamav: InstanceLimitedThread(Copy message 
2201 from packaging.clamav, started daemon).__bootstrap(): thread started
Copy message 2201 from packaging.clamav:
   Copy message 2201 IMAP[packaging.clamav] - Maildir[packaging.clamav], 
LocalStatus[packaging.clamav]
Copy message 2201 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire(1): blocked waiting, value=0
Copy message 2201 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c91080e10.release: success, value=1
Copy message 2201 from packaging.clamav: Condition(thread.lock object at 
0x7f0c9d86f288, 1).notify(): notifying 1 waiter
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f288, 0).wait(): got it
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.acquire: success, value=0
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c91080e10.release: success, value=1
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f288, 0).notify(): no waiters
Folder sync Lobefin[packaging.clamav]: threading._BoundedSemaphore object at 
0x7f0c925d3210.acquire(1): blocked waiting, value=0
Copy message 2201 from packaging.clamav: threading._BoundedSemaphore object at 
0x7f0c925d3210.release: success, value=1
Copy message 2201 from packaging.clamav: Condition(thread.lock object at 
0x7f0c9d86f1e0, 1).notify(): notifying 1 waiter
Copy message 2201 from packaging.clamav: InstanceLimitedThread(Copy message 
2201 from packaging.clamav, started daemon).__bootstrap(): normal return
Folder sync Lobefin[packaging.clamav]: Condition(thread.lock object at 
0x7f0c9d86f1e0, 0).wait(): got it
Copy message 2201 from 

Bug#596291: offlineimap getting stuck when run with multiple threads/connections

2010-09-11 Thread Stephen Gran
This one time, at band camp, Rogério Brito said:
 As I reported in another bug (about offlineimap dying too easily under
 many circumstances), attached is what I see when I strace one of the
 running threads of offlineimap when it is run with 10 connections to
 initially mirror my gmail account to a local maildir.
 
 I am not trying anymore to push the messages to the gmail server---I am
 now trying to pull a fresh copy of them from there.
 
 If there is anything that would be desired and if I can reproduce it,
 then, please, let me know.

Hi,

Just a me too report.  I'm seeing it die after fetching a few thousand
mails here, although the time to hang is variable.  I've tried running
it against both the lenny dovecot and the current bpo dovecot, and it's
the same.  I've also tried with both python 2.6 and 2.5 to see if the
recent python transition is what's broken it, but it also hangs with 2.5.
Below is the end of the debug imap output.  Trying to run debug thread
(which is where it seems to actually hang) throws a recursion exceeded
exception, which is slightly unhelpful :)

Thanks for looking at this.

Copy message 3166 from lists.clamav-users:
   Copy message 3166 IMAP[lists.clamav-users] - Maildir[lists.clamav-users], 
LocalStatus[lists.clamav-users]
Copy message 3166 from lists.clamav-users:
   DEBUG[imap]:   49:23.94  GAHI1199 UID FETCH 3166 (BODY.PEEK[])


The server is just sat in:
epoll_wait(5, {}, 4, 5000)  = 0

So it looks like they've somehow got out of sync with each other.  I'm
assuming the server thinks it's answered the last request, and
offlineimap thinks it's made the last request, so they're deadlocked
waiting for the other side to do something.

Cheers,
-- 
 -
|   ,''`.Stephen Gran |
|  : :' :sg...@debian.org |
|  `. `'Debian user, admin, and developer |
|`- http://www.debian.org |
 -


signature.asc
Description: Digital signature


Bug#596291: offlineimap getting stuck when run with multiple threads/connections

2010-09-11 Thread Stephen Gran
This one time, at band camp, Stephen Gran said:
 Copy message 3166 from lists.clamav-users:
Copy message 3166 IMAP[lists.clamav-users] - Maildir[lists.clamav-users], 
 LocalStatus[lists.clamav-users]
 Copy message 3166 from lists.clamav-users:
DEBUG[imap]:   49:23.94  GAHI1199 UID FETCH 3166 (BODY.PEEK[])
 
 The server is just sat in:
 epoll_wait(5, {}, 4, 5000)  = 0

I've managed to correlate the server side and the client side a bit
more:

Copy message 2099 from debian.amd64.2007:
   Copy message 2099 IMAP[debian.amd64.2007] - Maildir[debian.amd64.2007], 
LocalStatus[debian.amd64.2007]
Copy message 2100 from debian.amd64.2007:
   Copy message 2100 IMAP[debian.amd64.2007] - Maildir[debian.amd64.2007], 
LocalStatus[debian.amd64.2007]


But the server sees:

[pid 12586] read(0, BHAJ7358 UID FETCH 2099 (BODY.PEEK[])\r\n..., 3029) = 39
[pid 12586] write(1, * 2099 FETCH (UID 2099 BODY[] {5143}\r\nReturn-path: 
, 4094) = 4094
[pid 12586] write(1, BHAJ7358 OK Fetch completed.\r\n..., 1120) = 1120
[pid 12586] epoll_wait(5,  unfinished ...
[pid 12568] ... epoll_wait resumed {}, 14, 1000) = 0
[pid 12568] epoll_wait(10,  unfinished ...
[pid 12586] ... epoll_wait resumed {}, 3, 517) = 0
[pid 12586] epoll_wait(5,  unfinished ...
[pid 12568] ... epoll_wait resumed {}, 14, 1000) = 0

and there it hangs.  So it looks like the client thinks it's sent a
FETCH, but the server never sees it.  Hope this is helpful.

I have so far been unable to reproduce this when running offlineimap
under strace, which suggests a timing dependant bug (kind of as
suspected, but still not all that helpful, I realize).  The earlier
strace (and the strace I see when attaching after it hangs) does suggest
that a lock is not being releaesed somewhere, but I can't immediately
see anything obviously wrong with the logic in the code.

Cheers,
-- 
 -
|   ,''`.Stephen Gran |
|  : :' :sg...@debian.org |
|  `. `'Debian user, admin, and developer |
|`- http://www.debian.org |
 -


signature.asc
Description: Digital signature


Bug#596291: offlineimap getting stuck when run with multiple threads/connections

2010-09-09 Thread Rogério Brito
Package: offlineimap
Severity: important

As I reported in another bug (about offlineimap dying too easily under
many circumstances), attached is what I see when I strace one of the
running threads of offlineimap when it is run with 10 connections to
initially mirror my gmail account to a local maildir.

I am not trying anymore to push the messages to the gmail server---I am
now trying to pull a fresh copy of them from there.

If there is anything that would be desired and if I can reproduce it,
then, please, let me know.


Regards,

-- 
Rogério Brito : rbr...@{ime.usp.br,gmail.com} : GPG key 1024D/7C2CAEB8
http://rb.doesntexist.org : Packages for LaTeX : algorithms.berlios.de
DebianQA: http://qa.debian.org/developer.php?login=rbrito%40ime.usp.br

futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa2aa1b8, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa2aa1b8, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa2aa1b8, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
read(12, , 5) = 0
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xa2aa1b8, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xa317238, FUTEX_WAIT_PRIVATE, 0, NULL) = 0