On Mon, Feb 11, 2019 at 10:33 AM Tom Lane wrote:
> Thomas Munro writes:
> > So I'll wait until after the release, and we'll have
> > to live with the bug for another 3 months.
>
> Check. Please hold off committing until you see the release tags
> appear, probably late Tuesday my time / Wednesday
Thomas Munro writes:
> On Mon, Feb 11, 2019 at 10:33 AM Tom Lane wrote:
>> I observe from
>> https://coverage.postgresql.org/src/backend/utils/mmgr/freepage.c.gcov.html
>> that the edge cases in this function aren't too well exercised by
>> our regression tests, meaning that the buildfarm might n
On Mon, Feb 11, 2019 at 10:33 AM Tom Lane wrote:
> I observe from
>
> https://coverage.postgresql.org/src/backend/utils/mmgr/freepage.c.gcov.html
>
> that the edge cases in this function aren't too well exercised by
> our regression tests, meaning that the buildfarm might not prove
> much either w
On Mon, Feb 11, 2019 at 11:02 AM Justin Pryzby wrote:
> On Mon, Feb 11, 2019 at 09:45:07AM +1100, Thomas Munro wrote:
> > Ouch. Yeah, that'd do it and matches the evidence. With this change,
> > I couldn't reproduce the problem after 90 minutes with a test case
> > that otherwise hits it within
On Mon, Feb 11, 2019 at 09:45:07AM +1100, Thomas Munro wrote:
> Ouch. Yeah, that'd do it and matches the evidence. With this change,
> I couldn't reproduce the problem after 90 minutes with a test case
> that otherwise hits it within a couple of minutes.
...
> Note that this patch addresses the e
Thomas Munro writes:
> This brings us to a difficult choice: we're about to cut a new
> release, and this could in theory be included. Even though the fix is
> quite convincing, it doesn't seem wise to change such complicated code
> at the last minute, and I know from an off-list chat that that i
On Sun, Feb 10, 2019 at 5:41 PM Robert Haas wrote:
> On Sun, Feb 10, 2019 at 2:37 AM Thomas Munro wrote:
> > But at first glance it shouldn't be allocating pages, because it just
> > does consolidation to try to convert to singleton format, and then it
> > does recycle list cleanup using soft=t
On Sun, Feb 10, 2019 at 07:11:22PM +0300, Sergei Kornilov wrote:
> > I ran overnight with this patch, but all parallel processes ended up stuck
> > in the style of bug#15585. So that's either not the root cause, or there's
> > a 2nd issue.
>
> Maybe I missed something in this discussion,
Hi
> I ran overnight with this patch, but all parallel processes ended up stuck in
> the style of bug#15585. So that's either not the root cause, or there's a 2nd
> issue.
Maybe I missed something in this discussion, but can you reproduce bug#15585?
How? With this testcase:
https://www.postgres
On Sun, Feb 10, 2019 at 12:10:52PM +0530, Robert Haas wrote:
> I think I see what's happening. At the moment the problem occurs,
> there is no btree - there is only a singleton range. So
> FreePageManagerInternal() takes the fpm->btree_depth == 0 branch and
> then ends up in the section with the
On Sun, Feb 10, 2019 at 2:37 AM Thomas Munro wrote:
> ... but why would it do that? I can reproduce cases where (for
> example) FreePageManagerPutInternal() returns 179, and then
> FreePageManagerLargestContiguous() returns 179, but then after
> FreePageBtreeCleanup() it returns 178. At that poi
On Sun, Feb 10, 2019 at 1:55 AM Thomas Munro wrote:
> Bleugh. Yeah. What I said before wasn't quite right. The value
> returned by FreePageManagerPutInternal() is actually correct at the
> moment it is returned, but it ceases to be correct immediately
> afterwards if the following call to FreeP
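To make the failure mode described above concrete, here is a minimal toy sketch in C. It is not code from freepage.c: the type ToyFreePages and the functions toy_put(), toy_cleanup(), toy_free_buggy() and toy_free_recompute() are hypothetical names invented for illustration, and toy_cleanup() is only a stand-in for any later step that shrinks a free run. It shows how a run length that was correct when returned can become too large by one if it is cached before a later step changes the free space, and how recomputing after that step avoids the stale value.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16

/* Hypothetical toy structure: one flag per page plus a cached maximum. */
typedef struct ToyFreePages
{
    bool   is_free[NPAGES];
    size_t cached_largest;      /* cached length of the longest free run */
} ToyFreePages;

/* Scan the whole map and return the true longest run of free pages. */
static size_t
toy_largest_run(const ToyFreePages *fp)
{
    size_t best = 0;
    size_t run = 0;

    for (size_t i = 0; i < NPAGES; i++)
    {
        run = fp->is_free[i] ? run + 1 : 0;
        if (run > best)
            best = run;
    }
    return best;
}

/* Mark pages free; return the length of the free run containing first_page. */
static size_t
toy_put(ToyFreePages *fp, size_t first_page, size_t npages)
{
    size_t lo;
    size_t hi;

    if (first_page >= NPAGES)
        return 0;
    for (size_t i = first_page; i < first_page + npages && i < NPAGES; i++)
        fp->is_free[i] = true;

    /* Extend in both directions to measure the merged run. */
    lo = first_page;
    while (lo > 0 && fp->is_free[lo - 1])
        lo--;
    hi = first_page;
    while (hi < NPAGES && fp->is_free[hi])
        hi++;
    return hi - lo;
}

/* Stand-in for a later step that consumes the first free page it finds. */
static void
toy_cleanup(ToyFreePages *fp)
{
    for (size_t i = 0; i < NPAGES; i++)
    {
        if (fp->is_free[i])
        {
            fp->is_free[i] = false;
            return;
        }
    }
}

/* Risky ordering: cache the value returned by put, then run cleanup. */
static size_t
toy_free_buggy(ToyFreePages *fp, size_t first_page, size_t npages)
{
    size_t created = toy_put(fp, first_page, npages);

    if (created > fp->cached_largest)
        fp->cached_largest = created;
    toy_cleanup(fp);            /* may invalidate the cached value */
    return fp->cached_largest;
}

/* Safer ordering: recompute the maximum after cleanup has finished. */
static size_t
toy_free_recompute(ToyFreePages *fp, size_t first_page, size_t npages)
{
    toy_put(fp, first_page, npages);
    toy_cleanup(fp);
    fp->cached_largest = toy_largest_run(fp);
    return fp->cached_largest;
}
```

In this toy, freeing pages 4..8 on an empty map makes toy_free_buggy() cache a run of 5 while only 4 contiguous pages remain after cleanup; toy_free_recompute() reports 4. Whether the real cleanup path shrinks the run in this way is exactly what the thread is investigating, so treat this only as an illustration of the stale-cache pattern.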
On Sun, Feb 10, 2019 at 7:24 AM Thomas Munro wrote:
> On Sat, Feb 9, 2019 at 9:21 PM Robert Haas wrote:
> > On Fri, Feb 8, 2019 at 8:00 AM Thomas Munro wrote:
> > > Sometimes FreePageManagerPutInternal() returns a
> > > number-of-contiguous-pages-created-by-this-insertion that is too large
> > >
On Sat, Feb 9, 2019 at 9:21 PM Robert Haas wrote:
> On Fri, Feb 8, 2019 at 8:00 AM Thomas Munro wrote:
> > Sometimes FreePageManagerPutInternal() returns a
> > number-of-contiguous-pages-created-by-this-insertion that is too large
> > by one. [...]
>
> I spent a long time thinking about this and st
On Fri, Feb 8, 2019 at 8:00 AM Thomas Munro wrote:
> Sometimes FreePageManagerPutInternal() returns a
> number-of-contiguous-pages-created-by-this-insertion that is too large
> by one. If this happens to be a new max-number-of-contiguous-pages,
> it causes trouble some arbitrary time later because th
On Fri, Feb 8, 2019 at 4:49 AM Thomas Munro wrote:
> I don't have the answer yet but I have some progress: I finally
> reproduced the "could not find %d free pages" error by running lots of
> concurrent parallel queries. Will investigate.
Sometimes FreePageManagerPutInternal() returns a
number-of-co
On Thu, Feb 7, 2019 at 9:10 PM Jakub Glapa wrote:
> > Do you have query logging enabled? If not, could you consider enabling it
> > on at least one of those servers? I'm interested to know what ELSE is
> > running at the time that query failed.
>
> Ok, I have configured that and will enable in the
> Do you have query logging enabled? If not, could you consider enabling it
> on at least one of those servers? I'm interested to know what ELSE is
> running at the time that query failed.
Ok, I have configured that and will enable in the time window when the
errors usually occur. I'll report as soon as I
Moving to -hackers, hopefully it doesn't confuse the list scripts too much.
On Mon, Feb 04, 2019 at 08:52:17AM +0100, Jakub Glapa wrote:
> I see the error showing up every night on 2 different servers. But it's a
> bit of a heisenbug because if I go there now it won't be reproducible.
Do you have
On Sat, Dec 1, 2018 at 9:46 AM Justin Pryzby wrote:
> elog(FATAL,
>      "dsa_allocate could not find %zu free pages", npages);
> + abort();
If anyone can reproduce this problem with a debugger, it'd be
interesting to see
On Fri, Nov 30, 2018 at 08:20:49PM +0100, Jakub Glapa wrote:
> Over the last few days of monitoring, no segfault occurred, but the
> dsa_allocate error did.
> I'm starting to doubt whether the segfault I found in dmesg was actually
> related.
The dmesg looks like a real crash, not just OOM. You can hope
Hi, just a small update.
I've configured the OS for taking crash dumps on Ubuntu 16.04 with the
following (maybe somebody will find it helpful):
I've added LimitCORE=infinity to /lib/systemd/system/postgresql@.service
under [Service] section
I've reloaded the service config with sudo systemctl daem
On Tue, Nov 27, 2018 at 4:00 PM Thomas Munro wrote:
> Hmm. I will see if I can come up with a many-partition torture test
> reproducer for this.
No luck. I suppose one theory that could link both failure modes
would be a buffer overrun, where in the non-shared case it trashes a
pointer that is lat
On Tue, Nov 27, 2018 at 7:45 AM Alvaro Herrera wrote:
> On 2018-Nov-26, Jakub Glapa wrote:
> > Justin thanks for the information!
> > I'm running Ubuntu 16.04.
> > I'll try to prepare for the next crash.
> > Couldn't find anything this time.
>
> As I recall, the apport stuff in Ubuntu is terrible