That is a reasonable guess, but this method is guaranteed to be called only in
the main thread. The program makes very little use of thread pools because it
is coroutine-based.
2017-12-22
hubo
From: William ML Leslie <william.leslie....@gmail.com>
Sent: 2017-12-19 20:14
Subject: Re: [pypy-dev] Mysterious IndexError in service running with PyPy
To: "hubo" <h...@jiedaibao.com>
Cc: "PyPy Developer Mailing List" <pypy-dev@python.org>
Where is the code that changes the size of self.heap? How do we know that
size(self.heap) is constant? My guess is that some thread changes this; but l
is not recomputed.
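To make that guess concrete, here is a minimal sketch of how a stale l could produce the IndexError. It is only an illustration under assumptions: the class name StaleLengthHeap, the on_iteration hook, and the shrink and demo helpers are hypothetical and not part of vlcp. The hook stands in for whatever might change self.heap while the loop is running.

```python
class StaleLengthHeap:
    """Hypothetical stand-in; only _siftdown mirrors the code in this thread."""

    def __init__(self, items):
        self.heap = list(items)            # entries are (priority, key)
        self.index = {key: i for i, (_, key) in enumerate(self.heap)}
        self.on_iteration = None           # hypothetical mutation hook

    def _siftdown(self, pos):
        temp = self.heap[pos]
        l = len(self.heap)                 # cached once, never refreshed
        while pos * 2 + 1 < l:
            if self.on_iteration is not None:
                self.on_iteration(self)    # simulates an outside mutation
            cindex = pos * 2 + 1
            pt = self.heap[cindex]         # IndexError if the list shrank
            if cindex + 1 < l and self.heap[cindex + 1][0] < pt[0]:
                cindex = cindex + 1
                pt = self.heap[cindex]
            if pt[0] < temp[0]:
                self.heap[pos] = pt
                self.index[pt[1]] = pos
            else:
                break
            pos = cindex
        self.heap[pos] = temp
        self.index[temp[1]] = pos


def shrink(h):
    # Drop everything but the root, as a concurrent pop might.
    del h.heap[1:]


def demo():
    items = [(5, "a"), (1, "b"), (2, "c")]
    ok = StaleLengthHeap(items)
    ok._siftdown(0)                        # no mutation: completes normally
    broken = StaleLengthHeap(items)
    broken.on_iteration = shrink
    try:
        broken._siftdown(0)
        raised = False
    except IndexError:
        raised = True
    return ok.heap, raised
```

Without the hook the sift completes normally. With it, the while test still passes against the old l and the very next index fails. This does not show that such a mutation happens in the real service, only what the symptom would look like if it did.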
On 18 Dec 2017 6:59 PM, "hubo" <h...@jiedaibao.com> wrote:
I'm reporting this issue to this mailing list even though I don't know whether
it is related to PyPy, because it is really strange and cannot be reproduced
reliably. I hope someone knows the reason or has some pointers.
I'm running some SDN services written in Python with PyPy 5.3.0. The related
code is here:
https://github.com/hubo1016/vlcp/blob/8022e3a3c67cf4305af503d507640a730ca3949d/vlcp/utils/indexedheap.py#L100
The full code is also in the repo, but it may be too complex to describe. The
relevant piece is quite simple:
    def _siftdown(self, pos):
        temp = self.heap[pos]
        l = len(self.heap)
        while pos * 2 + 1 < l:
            cindex = pos * 2 + 1
            pt = self.heap[cindex]
            if cindex + 1 < l and self.heap[cindex+1][0] < pt[0]:
                cindex = cindex + 1
                pt = self.heap[cindex]
            if pt[0] < temp[0]:
                self.heap[pos] = pt
                self.index[pt[1]] = pos
            else:
                break
            pos = cindex
        self.heap[pos] = temp
        self.index[temp[1]] = pos
It is a simple heap operation. The service uses a heap to process timers. When
the service is not busy, it usually runs this piece of code several times per
minute.
I have 32 servers running this service. They were quite stable for about three
months, but one day one of the services crashed on line 100 with an
IndexError:
    pt = self.heap[cindex]
As you can see, cindex = pos * 2 + 1, which is checked by the while
precondition just two lines earlier. There are no multi-threading issues here
either, because this piece of code always runs in the same thread. So in
theory this should not be possible.
Only the following facts are known about this issue:
It reproduces, though quite rarely. I've seen this kind of crash 4 times,
each on a different machine, so it should not be related to hardware issues.
Since I've got 32 servers, it might take more than one year to reproduce with a
single server.
It does not seem to be related to load. All of the crashes happened at night,
when there are few requests. Only some cleanup tasks run at fixed
intervals.
The services are running with PyPy 5.3.0. I've upgraded a few of them to 5.9,
but it will take a long time to validate whether this still happens. It has
not been validated on CPython either. I'm also trying to collect more debugging
information for this issue, but that is very hard since it rarely reproduces.
It is not a serious issue. It can be worked around with an auto-restart, but
I'm still searching for the cause.
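One way to collect more debugging information for such a rare crash is to record the heap state at the moment the IndexError is raised. The following is only a sketch under assumptions: the decorator capture_index_error, the logger name heap_debug, and the TinyHeap class are hypothetical and do not exist in vlcp. It uses only the standard library.

```python
import functools
import logging
import threading

log = logging.getLogger("heap_debug")  # hypothetical logger name


def capture_index_error(method):
    """Log the heap size and current thread when method raises IndexError."""

    @functools.wraps(method)
    def wrapper(self, pos):
        size_before = len(self.heap)
        try:
            return method(self, pos)
        except IndexError:
            # log.exception also records the traceback
            log.exception(
                "IndexError in %s: pos=%r size_before=%d size_now=%d thread=%s",
                method.__name__,
                pos,
                size_before,
                len(self.heap),
                threading.current_thread().name,
            )
            raise  # keep the original behaviour (crash, then auto-restart)

    return wrapper


class TinyHeap:
    """Hypothetical class, used only to show how the decorator is applied."""

    def __init__(self):
        self.heap = [1]

    @capture_index_error
    def _siftdown(self, pos):
        return self.heap[pos]
```

If size_before and size_now differ in a logged record, or the thread name is not the main thread, that would point towards the list being changed from elsewhere. If they are equal, the cause is more likely somewhere else.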
2017-12-18
hubo
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev