[DNG] Some RAID1's inaccessible after upgrade to beowulf from ascii

2021-11-09 Thread Hendrik Boom via Dng
I upgraded my server to beowulf.

After rebooting, all home directories except root's are no longer
accessible.

They are all on an LVM on software RAID.

The problem seems to be that two of my three RAID1 systems are not
starting up properly.  What can I do about it?



hendrik@april:/$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
[raid4] [raid10]
md1 : inactive sda2[3](S)
  2391296000 blocks super 1.2

md2 : inactive sda3[0](S)
  1048512 blocks

md0 : active raid1 sdf4[1]
  706337792 blocks [2/1] [_U]

unused devices: <none>
hendrik@april:/$
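
For readers who want to check array state from a script rather than by eye, here is a minimal sketch in Python that parses mdstat-style text. The sample is copied from the output above; the function name parse_mdstat and the regular expressions are illustrative assumptions, not part of mdadm or any library. It only reports what the kernel already printed and does not explain why an array is inactive.

```python
import re

# Sample copied from the output above; on a live system one would read
# /proc/mdstat instead (assumption: the format matches this sample).
MDSTAT = """\
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : inactive sda2[3](S)
  2391296000 blocks super 1.2

md2 : inactive sda3[0](S)
  1048512 blocks

md0 : active raid1 sdf4[1]
  706337792 blocks [2/1] [_U]
"""


def parse_mdstat(text):
    """Return {array: {"state": ..., "members": [...]}} for each mdN line."""
    arrays = {}
    for line in text.splitlines():
        m = re.match(r"^(md\d+) : (\w+)\s*(.*)$", line)
        if not m:
            continue
        name, state, rest = m.groups()
        # Members look like sda2[3] or sda2[3](S); other words on the line
        # (such as the level "raid1") carry no [N] suffix and are skipped.
        members = re.findall(r"(\w+)\[\d+\](?:\(\w\))?", rest)
        arrays[name] = {"state": state, "members": members}
    return arrays


arrays = parse_mdstat(MDSTAT)
inactive = sorted(n for n, info in arrays.items() if info["state"] != "active")
for name in sorted(arrays):
    print(name, arrays[name]["state"], arrays[name]["members"])
print("not active:", inactive)
```

On this sample, each inactive array lists a single member, and md0 is running on one of its two devices.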



hendrik@april:/$ cat /etc/mdadm/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2
UUID=4dc189ba:e7a12d38:e6262cdf:db1beda2
ARRAY /dev/md1 metadata=1.2 name=april:1
UUID=c328565c:16dce536:f16da6e2:db603645
ARRAY /dev/md2 UUID=5d63f486:183fd2ea:c2a3a88f:cb2b61de
MAILADDR root
hendrik@april:/$



The standard recommendation seems to be to replace the ARRAY lines
in /etc/mdadm/mdadm.conf with the lines produced by mdadm --examine --scan:



april:~# mdadm --examine --scan
ARRAY /dev/md/1  metadata=1.2 UUID=c328565c:16dce536:f16da6e2:db603645
name=april:1
ARRAY /dev/md2 UUID=5d63f486:183fd2ea:c2a3a88f:cb2b61de
ARRAY /dev/md0 UUID=4dc189ba:e7a12d38:e6262cdf:db1beda2
april:~#



But this replacement involves changing a line that does work (md0),
leaving unchanged one that does not (md2),
and changing another one that does not work (md1).

Since --examine's suggested changes seem uncorrelated
with the active/inactive record, I have little faith in
this alleged fix without first gaining more understanding.
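
The comparison described above can be made mechanical. The following is a rough sketch in Python, assuming the ARRAY entries are exactly as pasted in this message (including the mail-wrapped lines); parse_array_lines and diff are hypothetical helper names, not mdadm features. It matches entries by UUID and lists which options differ. It compares text only and says nothing about why md1 and md2 are inactive.

```python
# ARRAY entries as pasted above, from mdadm.conf and from
# "mdadm --examine --scan" (line wrapping preserved as in the mail).
CONF = """\
ARRAY /dev/md0 level=raid1 num-devices=2
UUID=4dc189ba:e7a12d38:e6262cdf:db1beda2
ARRAY /dev/md1 metadata=1.2 name=april:1
UUID=c328565c:16dce536:f16da6e2:db603645
ARRAY /dev/md2 UUID=5d63f486:183fd2ea:c2a3a88f:cb2b61de
"""

SCAN = """\
ARRAY /dev/md/1  metadata=1.2 UUID=c328565c:16dce536:f16da6e2:db603645
name=april:1
ARRAY /dev/md2 UUID=5d63f486:183fd2ea:c2a3a88f:cb2b61de
ARRAY /dev/md0 UUID=4dc189ba:e7a12d38:e6262cdf:db1beda2
"""


def parse_array_lines(text):
    """Map UUID -> {"device": ..., plus any other key=value options}."""
    entries = []
    # Split on whitespace so an entry wrapped over two lines still parses.
    for word in text.split():
        if word == "ARRAY":
            entries.append([])
        elif entries:
            entries[-1].append(word)
    result = {}
    for entry in entries:
        opts = dict(w.split("=", 1) for w in entry[1:] if "=" in w)
        uuid = opts.pop("UUID")
        result[uuid] = {"device": entry[0], **opts}
    return result


def diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


conf = parse_array_lines(CONF)
scan = parse_array_lines(SCAN)
print("same UUIDs in both:", set(conf) == set(scan))
for uuid in sorted(conf):
    print(conf[uuid]["device"], diff(conf[uuid], scan.get(uuid, {})))
```

For this input, the same three UUIDs appear in both sources; the md0 entry differs only by the dropped level and num-devices options, the md1 entry only in the device name (/dev/md1 versus /dev/md/1), and the md2 entry not at all.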

-- hendrik
___
Dng mailing list
Dng@lists.dyne.org
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng


Re: [DNG] Is it dead yet? -- SOLVED

2021-10-27 Thread Hendrik Boom via Dng
Not dead, but just injured.
Replaced two 312M RAM cards with one faster 2G RAM card.
Old RAM defective; tech told me that its speed was actually mismatched to
the rest of the system, slowing it down.  Of course "defective" was the
crucial issue, not speed.
Machine works fine now, running devuan ascii.

Next plan: upgrade to current chimaera in stages.

Thank you for all the testing suggestions.  They really helped narrow down
the problem.  Not to mention moral support.

-- hendrik


Re: [DNG] Is it dead yet?

2021-10-26 Thread Hendrik Boom via Dng
Chimaera didn't help, though the symptoms changed.
I booted from the chimaera live desktop, and it started looking hopeful,
but before much was running it had a kernel panic -- attempt to kill init.
Yes, I checked the checksum on the download.
And when I used chimaera to boot another computer (a Purism laptop) it
worked flawlessly.
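
For anyone repeating the checksum step, here is a minimal sketch in Python. It assumes the mirror's checksum file uses the usual sha256sum layout (hex digest, whitespace, file name) and that the digests are SHA-256; both are assumptions, as are the helper names sha256_of, expected_digest and verify.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large ISO need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def expected_digest(sums_text, filename):
    """Find filename in sha256sum-style text; return its digest or None."""
    for line in sums_text.splitlines():
        parts = line.split(None, 1)
        # sha256sum marks binary mode with a leading "*" on the name.
        if len(parts) == 2 and parts[1].strip().lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(path, sums_text, filename):
    """True only if filename is listed and the file's digest matches."""
    expected = expected_digest(sums_text, filename)
    return expected is not None and sha256_of(path) == expected
```

A call would look like verify("image.iso", open("SHA256SUMS").read(), "image.iso"), where both file names are placeholders for whatever was downloaded from the mirror.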

I'm once more suspecting hardware.
Next stop -- memtest.

And memtest shows an enormous number of memory failures.  Which makes me
suspect something systemic, not just a bit here or there.

I'll have to replace the RAM, I guess.  Or find out what memory bus is
failing.

-- hendrik

On Tue, Oct 26, 2021 at 11:26 AM tempforever wrote:

> Download links are available on the devuan download page
> https://www.devuan.org/get-devuan
>
> choose a mirror (for example, mirror.leaseweb.com/devuan/) then navigate
> to the devuan_chimaera/ directory where you will hopefully find a
> desktop-live/ directory which should contain
> devuan_chimaera_4.0.0_[amd64|i386]_desktop-live.iso along with the
> shasums files.  If you prefer not to use http(s), there are ftp mirrors
> and torrent on that same download page.
>
> Hendrik Boom via Dng wrote:
> > Then I'd have to figure out how to get a new kernel into it while it
> > crashes during bootup, which it now does quite consistently.
> >
> > If I had made recent changes (such as upgrades) I'd consider a kernel
> > problem more likely.  But it has been running for months with only
> > occasional problems, and now I get problems every time I boot.  Since I
> > haven't done an upgrade at all yet this month, I'll suspect hardware.
> >
> > Still, I will try, say, a chimaera live CD if I can find one.  Where are
> > the Devuan live CD's?  I seem to remember that they were called
> > something other than "devuan".
> > (Yes, I'm aware that the live CD will likely be a USB stick)
> >
> > -- hendrik
> >
> > On Tue, Oct 26, 2021 at 7:47 AM Boian Bonev wrote:


Re: [DNG] Is it dead yet?

2021-10-26 Thread Hendrik Boom via Dng
Then I'd have to figure out how to get a new kernel into it while it
crashes during bootup,
which it now does quite consistently.

If I had made recent changes (such as upgrades) I'd consider a kernel
problem more likely.
But it has been running for months with only occasional problems, and now I
get problems
every time I boot.  Since I haven't done an upgrade at all yet this month,
I'll suspect hardware.

Still, I will try, say, a chimaera live CD if I can find one.  Where are
the Devuan live CD's?
I seem to remember that they were called something other than "devuan".
(Yes, I'm aware that the live CD will likely be a USB stick)

-- hendrik

On Tue, Oct 26, 2021 at 7:47 AM Boian Bonev wrote:

> Hi Hendrik,
>
> It clearly says that this is a soft lockup; if it were a hardware problem,
> it would say hard lockup, wouldn't it?
>
> That message does not really indicate a hardware problem; most probably it
> is a bug in the kernel. It can also appear when a kernel driver is waiting
> for, e.g., a hard drive that is slow to respond. Try upgrading to a more
> recent kernel (e.g. 5.10.x or 5.14.x); maybe the problem is already fixed
> there? It may also be a temporary condition, a very rare edge case that
> made the kernel lock up, in which case a reboot will help to clean the
> state...
>
> With best regards,
> b.
>
> On Mon, 2021-10-25 at 20:28 -0400, Hendrik Boom via Dng wrote:
> > I think I may have a clue to the mysterious stoppages of my server.
> > I ssh'd into it today and got the following.  Is it time to give up on
> > this machine and replace it?
> >
> > -- hendrik
> >
> >
> > hendrik@april:~$ su -
> > Password:
> > april:~# ps | grep lighttpd
> > ^C^C
> > Message from syslogd@april at Oct 25 20:15:48 ...
> >  kernel:[ 5700.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:16:16 ...
> >  kernel:[ 5728.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:16:52 ...
> >  kernel:[ 5764.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:17:20 ...
> >  kernel:[ 5792.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:17:56 ...
> >  kernel:[ 5828.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:18:24 ...
> >  kernel:[ 5856.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:18:56 ...
> >  kernel:[ 5888.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:19:24 ...
> >  kernel:[ 5916.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:20:00 ...
> >  kernel:[ 5952.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >
> > Message from syslogd@april at Oct 25 20:20:28 ...
> >  kernel:[ 5980.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
> >


[DNG] Is it dead yet?

2021-10-25 Thread Hendrik Boom via Dng
I think I may have a clue to the mysterious stoppages of my server.
I ssh'd into it today and got the following.  Is it time to give up on this
machine and replace it?

-- hendrik


hendrik@april:~$ su -
Password:
april:~# ps | grep lighttpd
^C^C
Message from syslogd@april at Oct 25 20:15:48 ...
 kernel:[ 5700.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:16:16 ...
 kernel:[ 5728.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:16:52 ...
 kernel:[ 5764.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [ps:9184]

Message from syslogd@april at Oct 25 20:17:20 ...
 kernel:[ 5792.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:17:56 ...
 kernel:[ 5828.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:18:24 ...
 kernel:[ 5856.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:18:56 ...
 kernel:[ 5888.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:19:24 ...
 kernel:[ 5916.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:20:00 ...
 kernel:[ 5952.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]

Message from syslogd@april at Oct 25 20:20:28 ...
 kernel:[ 5980.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [ps:9184]
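
To see at a glance whether the reports above all concern the same task, the messages can be parsed with a short script. This is a sketch in Python under the assumption that every report has the form shown in this message; the regular expression is written for this sample only, and the sample holds just the first four reports to keep the block short.

```python
import re

# First four reports from the transcript above (mail line wrapping kept).
LOG = """\
 kernel:[ 5700.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for
22s! [ps:9184]
 kernel:[ 5728.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for
22s! [ps:9184]
 kernel:[ 5764.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for
23s! [ps:9184]
 kernel:[ 5792.156005] NMI watchdog: BUG: soft lockup - CPU#1 stuck for
22s! [ps:9184]
"""

# \s+ after "for" lets the pattern match across the wrapped line break.
PATTERN = re.compile(
    r"kernel:\[\s*(\d+\.\d+)\] NMI watchdog: BUG: soft lockup - "
    r"CPU#(\d+) stuck for\s+(\d+)s! \[(\w+):(\d+)\]"
)

events = [
    (float(t), int(cpu), int(secs), comm, int(pid))
    for t, cpu, secs, comm, pid in PATTERN.findall(LOG)
]

# Seconds of uptime between consecutive reports.
gaps = [round(b[0] - a[0], 3) for a, b in zip(events, events[1:])]
# Distinct (CPU, command, PID) combinations that were reported stuck.
stuck_tasks = {(cpu, comm, pid) for _, cpu, _, comm, pid in events}

print("reports:", len(events))
print("gaps between reports (s):", gaps)
print("stuck tasks:", stuck_tasks)
```

In this sample every report names the same CPU and the same ps process (PID 9184), so it is one task stuck continuously rather than several separate incidents.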