[Yet another oddity.]
On 2020-Jun-11, at 21:05, Mark Millard wrote:

There is another oddity in the code structure, in
that if pt was ever NULL the code would misuse the
NULL before the test for non-NULL is made:

	pt = moea_pvo_to_pte(pvo, -1);
	. . .
	old_pte = *pt;
	/*
	 * If the PVO is in the page table
[Just a better panic backtrace text copy.]
On 2020-Jun-11, at 20:29, Mark Millard wrote:
> On 2020-Jun-11, at 19:25, Justin Hibbits wrote:
>
>> On Thu, 11 Jun 2020 17:30:24 -0700
>> Mark Millard wrote:
>>
>>> On 2020-Jun-11, at 16:49, Mark Millard wrote:
>>>
On 2020-Jun-11, at 14:42, Justin Hibbits wrote:
. . .
>> That said, the attached patch effectively copies
>> what's
On 2020-Jun-11, at 14:41, Brandon Bergren wrote:
> An update from my end: I now have the ability to test dual processor G4 as
> well, now that mine is up and running.
Cool.
FYI:
Dual processors are not required for the
problem to happen: the stress-based testing
showed the problem just as
On 2020-Jun-11, at 13:55, Justin Hibbits wrote:
> On Wed, 10 Jun 2020 18:56:57 -0700
> Mark Millard wrote:
>
>> On 2020-May-13, at 08:56, Justin Hibbits wrote:
>>
>>> Hi Mark,
>>
>> Hello Justin.
>
> Hi Mark,
Hello again, Justin.
>>
>>> On Wed, 13 May 2020 01:43:23 -0700
>>> Mark Millard wrote:
On Thu, 11 Jun 2020 14:36:37 -0700
Mark Millard wrote:
> . . .
An update from my end: I now have the ability to test dual processor G4 as
well, now that mine is up and running.
On Thu, Jun 11, 2020, at 4:36 PM, Mark Millard wrote:
>
> How did you test?
>
> In my context it was far easier to see the problem
> with builds that did not use MALLOC_PRODUCTION.
On Wed, 10 Jun 2020 18:56:57 -0700
Mark Millard wrote:
> On 2020-May-13, at 08:56, Justin Hibbits wrote:
>
> > Hi Mark,
>
> Hello Justin.
Hi Mark,
>
> > On Wed, 13 May 2020 01:43:23 -0700
> > Mark Millard wrote:
> >
> >> [I'm adding a reference to an old arm64/aarch64 bug that had
>
On 2020-May-13, at 08:56, Justin Hibbits wrote:
> Hi Mark,
Hello Justin.
> On Wed, 13 May 2020 01:43:23 -0700
> Mark Millard wrote:
>
>> [I'm adding a reference to an old arm64/aarch64 bug that had
>> pages turning to zero, in case this 32-bit powerpc issue is
>> somewhat analogous.]
>>
>
Hi Mark,
On Wed, 13 May 2020 01:43:23 -0700
Mark Millard wrote:
> [I'm adding a reference to an old arm64/aarch64 bug that had
> pages turning to zero, in case this 32-bit powerpc issue is
> somewhat analogous.]
>
> On 2020-May-13, at 00:29, Mark Millard wrote:
>
> > [stress alone is sufficient to have jemalloc asserts fail
[I'm adding a reference to an old arm64/aarch64 bug that had
pages turning to zero, in case this 32-bit powerpc issue is
somewhat analogous.]
On 2020-May-13, at 00:29, Mark Millard wrote:
> [stress alone is sufficient to have jemalloc asserts fail
> in stress, no need for a multi-socket G4 either
[stress alone is sufficient to have jemalloc asserts fail
in stress, no need for a multi-socket G4 either. No need
to involve nfsd, mountd, rpcbind or the like. This is not
a claim that I know all the problems to be the same, just
that a jemalloc reported failure in this simpler context
happens and
[Yet another new kind of experiment. But this looks
like I can cause problems in fairly short order on
demand now. Finally! And with that I've much better
evidence on whether the kernel or the user-space
process makes the zeroed memory appear in, for
example, nfsd.]
I've managed to get:
:
/usr/src/contrib
[A new kind of experiment and partial results.]
Given the zero'ed memory page(s) that for some of
the example contexts include a page that should not
be changing after initialization in my context
(jemalloc global variables), I have attempted the
following for such examples:
A) Run gdb
B) Attach
[I caused nfsd to have things shifted in memory some to
see if it tracked content vs. page boundary for where the
zeros stop. Non-nfsd examples omitted.]
> . . .
>> nfsd hit an assert, failing ret == sz_size2index_compute(size)
>
> [Correction: That should have referenced sz_index2size_lookup(i
[More details for an sshd failure. The other
examples are omitted. The sshd failure also shows
an all-zeros-up-to-a-page-boundary issue, just for
a different address range.]
On 2020-May-7, at 12:06, Mark Millard wrote:
>
> [mountd failure example: also at sz_size2index_lookup assert
> for the same
[mountd failure example: also at sz_size2index_lookup assert
for the same zero'd memory problem.]
> On 2020-May-7, at 00:46, Mark Millard wrote:
>
> [__je_sz_size2index_tab seems messed up in 2 of the
> asserting contexts: first 384 are zero in both. More
> before that is also messed up (all zero). I show the
[__je_sz_size2index_tab seems messed up in 2 of the
asserting contexts: first 384 are zero in both. More
before that is also messed up (all zero). I show the
details later below.]
On 2020-May-6, at 16:57, Mark Millard wrote:
> [This explores process crashes that happen during system
> shutdown,
[This explores process crashes that happen during system
shutdown, in a context not having MALLOC_PRODUCTION= .
So assert failures are reported as the stopping points.]
It looks like shutdown -p now, shutdown -r now, and the
like can lead some processes to assert during their exit
attempt, includi
[This report just shows an interesting rpcbind crash:
a pointer was filled with part of a string instead,
leading to a failed memory access attempt from the junk
address produced.]
Core was generated by `/usr/sbin/rpcbind'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x5024405c
[This report just shows some material for the
sendmail SIGSEGV's, based on truss output.]
I've returned to using the modern jemalloc because
it seems to show problems more, after having
caught the earlier reported dhclient example under
the older jemalloc context. (Again: jemalloc may
be exposing
This note just reports things from looking at 2
.core files (mountd and rpcbind) from the new
jemalloc context's failures. Maybe someone who
knows more can get something out of it. I've
not included any of the prior message history.
For mountd:
Core was generated by `/usr/sbin/mountd -r'.
Program terminated with signal SIGSEGV, Segmentation fault.
[The bit argument to bitmap_unset seems to be way
too large.]
On 2020-May-3, at 11:08, Mark Millard wrote:
> [At around 4AM local time dhclient got a signal 11,
> despite the jemalloc revert. The other examples
> have not happened.]
>
> On 2020-May-2, at 18:46, Mark Millard wrote:
>
>> [I'm on
[At around 4AM local time dhclient got a signal 11,
despite the jemalloc revert. The other examples
have not happened.]
On 2020-May-2, at 18:46, Mark Millard wrote:
> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
>
> I've been reporting to some lists p
On 2020-May-3, at 01:26, nonameless at ukr.net wrote:
> --- Original message ---
> From: "Mark Millard"
> Date: 3 May 2020, 04:47:14
>
>
>
>> [I'm only claiming the new jemalloc is involved and that
>> reverting avoids the problem.]
>>
>> I've been reporting to some lists problems with:
--- Original message ---
From: "Mark Millard"
Date: 3 May 2020, 17:38:14
>
>
> On 2020-May-3, at 01:26, nonameless at
> ukr.net wrote:
>
>
>
>
> > --- Original message ---
> > From: "Mark Millard"
> > Date: 3 May 2020, 04:47:14
> >
> >
> >
> >> [I'm only claiming the new jem
--- Original message ---
From: "Mark Millard"
Date: 3 May 2020, 04:47:14
> [I'm only claiming the new jemalloc is involved and that
> reverting avoids the problem.]
>
> I've been reporting to some lists problems with:
>
> dhclient
> sendmail
> rpcbind
> mountd
> nfsd
>
> getting SIGSEGV (signal 11) crashes and some core
[I'm only claiming the new jemalloc is involved and that
reverting avoids the problem.]
I've been reporting to some lists problems with:
dhclient
sendmail
rpcbind
mountd
nfsd
getting SIGSEGV (signal 11) crashes and some core
dumps on the old 2-socket (1 core per socket) 32-bit
PowerMac G4 running