Where is the code that changes the size of self.heap? How do we know that
len(self.heap) stays constant while the loop runs? My guess is that some
thread changes it, but l is not recomputed.
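
To make the guess concrete, here is a minimal sketch (not code from vlcp) of
the same loop shape with the length cached once. The function name
siftdown_stale_length and the interfere hook are hypothetical; the hook only
stands in for whatever else might shrink the list, and the index bookkeeping
is left out for brevity:

```python
def siftdown_stale_length(heap, pos, interfere=None):
    """Same loop shape as _siftdown, with the length read only once."""
    temp = heap[pos]
    l = len(heap)  # cached here and never refreshed inside the loop
    while pos * 2 + 1 < l:
        if interfere is not None:
            # Hypothetical hook: stands in for another thread (or any other
            # code path) mutating the list while the loop is running.
            interfere(heap)
        cindex = pos * 2 + 1
        pt = heap[cindex]  # IndexError if the list shrank to cindex or fewer items
        if cindex + 1 < l and heap[cindex + 1][0] < pt[0]:
            cindex = cindex + 1
            pt = heap[cindex]
        if pt[0] < temp[0]:
            heap[pos] = pt
        else:
            break
        pos = cindex
    heap[pos] = temp


def truncate(heap):
    """Simulated interference: drop everything except the root."""
    del heap[1:]
```

With no interference the element sifts down normally. If the hook truncates
the list mid-loop, heap[cindex] raises IndexError even though the while test
passed against the stale l.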

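One way to test the single-thread assumption is sketched below. It assumes
that self.heap is a plain list that could be swapped for a subclass while
debugging. ThreadCheckedList is a hypothetical helper, not part of vlcp or
PyPy, and it only wraps append and pop, so any other mutating methods in use
would need the same treatment:

```python
import threading


class ThreadCheckedList(list):
    """Hypothetical debugging aid: a list that remembers its owner thread."""

    def __init__(self, *args):
        list.__init__(self, *args)
        # Thread that created the list is treated as the only legitimate mutator.
        self._owner = threading.current_thread().ident
        self.violations = []  # (method name, offending thread ident)

    def _check(self, name):
        ident = threading.current_thread().ident
        if ident != self._owner:
            self.violations.append((name, ident))

    def append(self, item):
        self._check("append")
        list.append(self, item)

    def pop(self, *args):
        self._check("pop")
        return list.pop(self, *args)
```

If violations stays empty across a crash, a second thread touching the heap
through these two methods can be ruled out; if not, the recorded method name
and thread ident show where to look.
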
On 18 Dec 2017 6:59 PM, "hubo" <h...@jiedaibao.com> wrote:

> I'm reporting this issue on this mailing list, though I don't know whether it
> is related to PyPy, because it is really strange and I cannot reproduce it
> reliably. But I hope someone may know the reason or have some pointers.
>
> I'm running some SDN services written in Python with PyPy 5.3.0.  The
> related code is here:
>
> https://github.com/hubo1016/vlcp/blob/8022e3a3c67cf4305af503d507640a730ca3949d/vlcp/utils/indexedheap.py#L100
>
> The full code is also in the repo, but may be too complex to describe. But
> this related piece is quite simple:
>
>     def _siftdown(self, pos):
>         temp = self.heap[pos]
>         l = len(self.heap)
>         while pos * 2 + 1 < l:
>             cindex = pos * 2 + 1
>             pt = self.heap[cindex]
>             if cindex + 1 < l and self.heap[cindex+1][0] < pt[0]:
>                 cindex = cindex + 1
>                 pt = self.heap[cindex]
>             if pt[0] < temp[0]:
>                 self.heap[pos] = pt
>                 self.index[pt[1]] = pos
>             else:
>                 break
>             pos = cindex
>         self.heap[pos] = temp
>         self.index[temp[1]] = pos
>
> It is a simple heap operation. The service uses a heap to process timers.
> When the service is not busy, it usually runs this piece of code several
> times per minute.
>
> I have 32 servers running this service. They had been stable for about
> three months, but one day one of the services crashed on line 100, reporting
> an IndexError:
>      pt = self.heap[cindex]
>
> As you can see, cindex = pos * 2 + 1, which is tested by the while
> precondition just two lines before. And there are no multi-threading
> issues here because this piece of code always runs in the same thread. So
> in theory it should not be possible for this to happen.
>
> Only the following facts are known about this issue:
>
>    1. It reproduces, though quite rarely. I've seen this kind of crash
>    4 times, each on a different machine, so it should not be related to
>    hardware issues. Since I've got 32 servers, it might take more than one
>    year to reproduce on a single server.
>    2. It does not seem to be related to load. All of the crashes happened
>    at night, when there are few requests. Only some cleanup tasks are
>    running at fixed intervals.
>
> The services are running with PyPy 5.3.0. I've upgraded a few of them to
> 5.9, but it will take a long time to validate whether this still happens.
> It has not been validated on CPython either. I'm also trying to collect more
> debugging information for this issue, but it is very hard since it rarely
> reproduces.
>
> It is not a serious issue. It can be worked around with an auto-restart,
> but I'm looking for the cause.
>
>
> 2017-12-18
> ------------------------------
> hubo
>
> _______________________________________________
> pypy-dev mailing list
> pypy-dev@python.org
> https://mail.python.org/mailman/listinfo/pypy-dev