Bug#897123: linux-image-4.15.0-3-amd64: Diskless workstation regularly stalls with 4.15 kernel

2018-04-28 Thread BERTRAND Joël
A clarification I forgot: I filed this bug against the kernel because the
NFS packages had not been upgraded when the bug first appeared.

The only workaround I have found: use the nolock mount option.

Best regards,

JKB



Bug#897123: linux-image-4.15.0-3-amd64: Diskless workstation regularly stalls with 4.15 kernel

2018-04-28 Thread BERTRAND Joël
Package: src:linux
Version: 4.15.17-1
Severity: important

Dear Maintainer,

I have used diskless workstations for a long time without any particular
trouble. Since I installed the 4.15 kernel from Debian testing, my main
workstation regularly stalls. In syslog, I see many messages like these:

Apr 28 18:51:53 hilbert kernel: [287335.545559] xs_tcp_setup_socket: connect returned unhandled error -107
Apr 28 18:51:55 hilbert kernel: [287337.057907] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:52:07 hilbert kernel: [287349.418038] lockd: server 192.168.10.128 OK
Apr 28 18:52:16 hilbert kernel: [287358.042284] lockd: server 192.168.10.128 OK
Apr 28 18:52:35 hilbert kernel: [287377.715081] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:53:10 hilbert kernel: [287412.071680] lockd: server 192.168.10.128 OK
Apr 28 18:53:10 hilbert kernel: [287412.072447] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:53:10 hilbert kernel: [287412.323525] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:53:33 hilbert kernel: [287435.567724] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:53:37 hilbert kernel: [287439.864192] lockd: server 192.168.10.128 not responding, still trying
Apr 28 18:54:03 hilbert kernel: [287465.336821] lockd: server 192.168.10.128 OK
Apr 28 18:54:09 hilbert kernel: [287471.648970] lockd: server 192.168.10.128 OK

nfsstat returns:
hilbert:[~] > nfsstat -m
/ from 192.168.10.128:/srv/hilbert
 Flags: rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.10.128

/home from 192.168.10.128:/home
 Flags: rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.10.128,mountvers=3,mountport=1020,mountproto=tcp,local_lock=none,addr=192.168.10.128
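
The two mounts above differ in the options that matter for the nolock workaround: the root mount already carries nolock and local_lock=all, while /home has local_lock=none. A minimal sketch of how that difference could be checked, assuming the flag strings are copied from nfsstat -m as shown; the helper names parse_flags and uses_lockd are illustrative and not part of any NFS tool, and the rule in uses_lockd is a simplification for NFSv3 mounts:

```python
def parse_flags(flags: str) -> dict:
    """Split an 'nfsstat -m' flag string into a dict; bare flags map to True."""
    options = {}
    for item in flags.strip().split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


def uses_lockd(flags: str) -> bool:
    """Assumption: the mount sends lock requests to the server's lockd
    unless it was mounted with nolock or reports local_lock=all."""
    opts = parse_flags(flags)
    return not (opts.get("nolock") or opts.get("local_lock") == "all")


# Flag strings taken from the nfsstat output in this report.
ROOT_FLAGS = ("rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,"
              "nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,"
              "local_lock=all,addr=192.168.10.128")
HOME_FLAGS = ("rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,"
              "proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.10.128,"
              "mountvers=3,mountport=1020,mountproto=tcp,local_lock=none,"
              "addr=192.168.10.128")

for name, flags in (("/", ROOT_FLAGS), ("/home", HOME_FLAGS)):
    print(name, "uses lockd" if uses_lockd(flags) else "locks locally")
```

Under that assumption, only /home would be affected by an unresponsive lockd, which is consistent with nolock acting as a workaround.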

The server's configuration has not changed. The server runs NetBSD 8, and
I saw no NFS malfunction with kernel 4.14. All other clients (Linux on arm
with kernel 4.9, Linux on x86 with kernel 4.14, and FreeBSD 11.1) run as
expected without any trouble.

Best regards,

JKB

-- Package-specific info:
** Version:
Linux version 4.15.0-3-amd64 (debian-kernel@lists.debian.org) (gcc version 7.3.0 (Debian 7.3.0-16)) #1 SMP Debian 4.15.17-1 (2018-04-19)

** Command line:
BOOT_IMAGE=pxelinux.cfg/vmlinuz-4.15.0-3-amd64-hilbert root=/dev/nfs initrd=pxelinux.cfg/initrd.img-4.15.0-3-amd64-hilbert nfsroot=192.168.10.128:/srv/hilbert ip=dhcp rw

** Tainted: O (4096)
 * Out-of-tree module has been loaded.

** Kernel log:
[285673.025753] xs_tcp_setup_socket: connect returned unhandled error -107
[285857.121396] lockd: server 192.168.10.128 not responding, still trying
[285857.373345] lockd: server 192.168.10.128 not responding, still trying
[285858.641114] lockd: server 192.168.10.128 not responding, still trying
[285866.965630] lockd: server 192.168.10.128 not responding, still trying
[285868.225678] lockd: server 192.168.10.128 not responding, still trying
[285938.011610] lockd: server 192.168.10.128 OK
[285939.019589] lockd: server 192.168.10.128 OK
[285947.627997] lockd: server 192.168.10.128 OK
[285949.139779] lockd: server 192.168.10.128 OK
[285956.528084] lockd: server 192.168.10.128 OK
[286860.226436] xs_tcp_setup_socket: connect returned unhandled error -107
[286958.020204] lockd: server 192.168.10.128 not responding, still trying
[286961.808706] lockd: server 192.168.10.128 not responding, still trying
[286971.148954] lockd: server 192.168.10.128 OK
[286994.873491] lockd: server 192.168.10.128 not responding, still trying
[287048.118707] lockd: server 192.168.10.128 not responding, still trying
[287082.463590] lockd: server 192.168.10.128 not responding, still trying
[287116.776538] lockd: server 192.168.10.128 OK
[287117.784429] lockd: server 192.168.10.128 OK
[287118.820500] lockd: server 192.168.10.128 OK
[287126.152638] lockd: server 192.168.10.128 OK
[287250.911579] lockd: server 192.168.10.128 not responding, still trying
[287251.919620] lockd: server 192.168.10.128 not responding, still trying
[287264.540008] lockd: server 192.168.10.128 OK
[287264.540141] xs_tcp_setup_socket: connect returned unhandled error -107
[287275.144907] lockd: server 192.168.10.128 OK
[287335.545548] lockd: server 192.168.10.128 not responding, still trying
[287335.545559] xs_tcp_setup_socket: connect returned unhandled error -107
[287337.057907] lockd: server 192.168.10.128 not responding, still trying
[287349.418038] lockd: server 192.168.10.128 OK
[287358.042284] lockd: server 192.168.10.128 OK
[287377.715081] lockd: server 192.168.10.128 not responding, still trying
[287408.271051] lockd: server 192.168.10.128 not responding, still trying
[287412.071680] lockd: server 192.168.10.128 OK
[287412.072447] lockd: server 192.168.10.128 not responding, still trying
[287412.323525] lockd: server 192.168.10.128 not responding, still trying
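
To put numbers on the stalls, the lockd messages in a log like the one above can be paired up. A minimal sketch, assuming the kernel log format shown here; the function name stall_windows is illustrative. Because lockd logs one message per pending request, pairing the first "not responding" with the next "OK" only approximates the length of each stall:

```python
import re

# Matches the kernel timestamp, the server address and the reported state.
LOCKD = re.compile(r"\[\s*(\d+\.\d+)\] lockd: server (\S+) (not responding|OK)")


def stall_windows(lines):
    """Return (start, end) timestamp pairs, one per stall and per server."""
    started = {}   # server -> timestamp of the first unanswered message
    windows = []
    for line in lines:
        match = LOCKD.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        server, state = match.group(2), match.group(3)
        if state == "not responding":
            started.setdefault(server, stamp)  # keep the earliest timestamp
        elif server in started:
            windows.append((started.pop(server), stamp))
    return windows


# A few lines copied from the kernel log in this report.
SAMPLE = [
    "[285857.121396] lockd: server 192.168.10.128 not responding, still trying",
    "[285857.373345] lockd: server 192.168.10.128 not responding, still trying",
    "[285938.011610] lockd: server 192.168.10.128 OK",
    "[285939.019589] lockd: server 192.168.10.128 OK",
    "[286958.020204] lockd: server 192.168.10.128 not responding, still trying",
    "[286971.148954] lockd: server 192.168.10.128 OK",
]

for start, end in stall_windows(SAMPLE):
    print(f"stall of {end - start:.1f} s starting at {start:.0f}")
```

The same expression also matches the syslog form of the messages, since it searches for the bracketed kernel timestamp anywhere in the line.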

Re: sparc / Problems with 3.2-kernel

2013-06-03 Thread BERTRAND Joël

Artyom Tarasenko wrote:

On Sun, Jun 2, 2013 at 11:38 PM, BERTRAND Joël
joel.bertr...@systella.fr  wrote:

Andreas Barth wrote:


Hi,

today I tried to resurrect our buildd on schroeder (which is of
architecture sparc). While trying to do so, I saw a couple of strange
behaviours, like vi freezing after 20-30 seconds, or being unable to abort
tail with ctrl+c or suspend it with ctrl+z.  This didn't change after a
reboot. The machine was running our official 3.2 kernel.


Was the machine available via the network at the time of freeze?


After Martin rebooted the machine into oldstable's 2.6.32 kernel,
things are working as they should, and the buildd is up again.

Question: Is this a known issue? How can we get a fix for that?  (I
would assume DSA will consider it a non-option to not be able to
upgrade to our default kernels.)



 I have seen the same freezes on a Blade 2000 (2 * US III+/900 Cu) and on a U5
(US IIi/440), but not on a server that runs four US IIIi nor on a
T1000 (sun4v/US T1). I have tested Debian 3.2 kernels and official ones, and
both hang. I suppose there is another strange bug tied to a specific MMU. On
sparc64, 2.6.32 is stable but 3.2 is not. I have not tested newer kernels.


Bertrand, Andreas,

can you please enable the magic SysRq key and try to see what is the
kernel busy with when it hangs?


	When the freeze occurred, the workstation was totally blocked. SysRq no
longer worked. I have reinstalled my Blade 2000 with NetBSD, but I can try
to obtain more information in a couple of weeks. My U5 runs jessie
(with a 2.6.32 kernel).


Regards,

JKB


--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/51acd1b3.1020...@systella.fr



Re: sparc / Problems with 3.2-kernel

2013-06-02 Thread BERTRAND Joël

Andreas Barth wrote:

Hi,

today I tried to resurrect our buildd on schroeder (which is of
architecture sparc). While trying to do so, I saw a couple of strange
behaviours, like vi freezing after 20-30 seconds, or being unable to abort
tail with ctrl+c or suspend it with ctrl+z.  This didn't change after a
reboot. The machine was running our official 3.2 kernel.

After Martin rebooted the machine into oldstable's 2.6.32 kernel,
things are working as they should, and the buildd is up again.

Question: Is this a known issue? How can we get a fix for that?  (I
would assume DSA will consider it a non-option to not be able to
upgrade to our default kernels.)


Hello,

	I have seen the same freezes on a Blade 2000 (2 * US III+/900 Cu) and on
a U5 (US IIi/440), but not on a server that runs four US IIIi nor on a
T1000 (sun4v/US T1). I have tested Debian 3.2 kernels and official ones,
and both hang. I suppose there is another strange bug tied to a specific
MMU. On sparc64, 2.6.32 is stable but 3.2 is not. I have not tested newer
kernels.


Regards,

JKB





Bug#400372: dpkg randomly crashes on Sparc32 running HyperSPARC processor

2006-12-03 Thread BERTRAND Joël

Jurij Smakov wrote:

Hi Joel,


Hello Jurij,

Sorry, I cannot reproduce the behavior you are describing. I have a 
SS20 box, running sid with Debian's linux-image-2.6.18-3-sparc32 
(version 2.6.18-6) kernel:


	I have not tried the Debian package, only the official Linux kernel. I
have tested 2.6.19, but sunlance does not work for me in an SS20 that
has two sunlance interfaces (eth0/1) and one hme interface (eth2)... I don't
know what differs between the Debian and official kernels.



debian:~# uname -a
Linux debian 2.6.18-3-sparc32 #1 Fri Nov 24 16:10:16 GMT 2006 sparc GNU/Linux
debian:~# cat /proc/version 
Linux version 2.6.18-3-sparc32 (Debian 2.6.18-6) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)) #1 Fri Nov 24 16:10:16 GMT 2006
debian:~# cat /proc/cpuinfo 
cpu : ROSS HyperSparc RT625 or RT626

fpu : ROSS HyperSparc combined IU/FPU
promlib : Version 3 Revision 2
prom: 2.25
type: sun4m
ncpus probed: 2
ncpus active: 1
CPU0Bogo: 133.12
CPU0ClkTck  : 13300
MMU type: ROSS HyperSparc
contexts: 4096
nocache total   : 2252800
nocache used: 492800

To stress-test dpkg I've just run 'apt-get install kde' on an unstable 
system. That's a pretty big update:


Is your kernel SMP? Do you use HIGHMEM?


[..]
0 upgraded, 400 newly installed, 0 to remove and 2 not upgraded.
Need to get 242MB of archives.
After unpacking 678MB of additional disk space will be used.

and it completed without any problems. I also do not recall having any 
problems with dpkg recently. 


What OBP version do you have in these machines?


	I have 2.25 from Sun in one, 2.25R from ROSS in another, and 2.25W (?)
from a HyperSTATION in the third. The modules tested are single and dual
RT-626, with a VSIMM (4 and 8 MB) and 448 MB. I do not think this is a
hardware problem, because every configuration I have tried returns
exactly the same error. The hardware of the main station I use for tests
was validated under Solaris 9, which ran for several days without any
trouble. (However, I cannot use three or four CPUs under Solaris 9 without
getting a Watchdog reset: all combinations with more than two CPUs are
unstable, with the same results on the three stations. If I have time, I
shall try to install Solaris 2.7...)


ISTR that to run the latest HyperSPARC CPUs from Ross you need either 2.25 
from Sun or 2.25R from Ross. If you are running anything lower, I suggest 
you try to upgrade your firmware; see 
http://www.sunshack.org/data/bootroms.html for details.


	Maybe this problem is related to the one I see with two SuperSPARCs
(Oops in readv_pipe). I don't know...


Regards,

JKB



Bug#363344: initramfs-tools and HyperSPARC processor

2006-04-18 Thread BERTRAND Joël

Package: initramfs-tools
Version: 0.59b
Severity: grave
Arch: sparc

It is impossible to build a ramfs image on a HyperSPARC 
workstation. I have tried to install etch/sparc on an SS20 that runs 
four HyperSPARC processors. At boot, the system hangs when it tries to 
load the ramfs image. If I replace the HyperSPARC modules with 
SuperSPARC-II, I can restart the workstation without any trouble. I have 
tested several configurations and several different sparc32 CPUs. The only 
combination that fails is a ramfs created by initramfs-tools with HyperSPARC.


Regards,

JKB





Bug#363344: initramfs-tools and HyperSPARC processor

2006-04-18 Thread BERTRAND Joël

maximilian attems wrote:

severity 363344 important
tags 363344 moreinfo
thanks

dear bertrand,

On Tue, Apr 18, 2006 at 06:04:08PM +0200, BERTRAND Joël wrote:


Package: initramfs-tools
Version: 0.59b
Severity: grave
Arch: sparc



hmm it may be serious, but for now i go for important.
anyway you omitted lots of information.


Yes, I know ;-)

   It is impossible to build a ramfs image on a HyperSPARC 
workstation.


	I made a mistake... I meant that I cannot use a ramfs image (I don't
build ramfs images to boot my SS20).



please post the error when trying to build the initramfs?


	I cannot post any error because HyperSPARC is not stable enough with 
the 2.6 kernel. I only use kernels from the Debian repository, or the 
kernel in this deb package, which I use for tests: 
http://www.wooyd.org/debian/kernels/linux-image-2.6.16-1-sparc32_2.6.16-6_sparc.deb


I have tried to install etch/sparc on an SS20 that runs 
four HyperSPARC processors. At boot, the system hangs when it tries to 
load the ramfs image.



please post the relevant messages.
are you dropped into a console?


	No. There is only a panic message reporting a bad magic number when the
kernel tries to mount the ramfs image.


are your devices created? 

If I replace the HyperSPARC modules with SuperSPARC-II, I can 
restart the workstation without any trouble. I have tested several 
configurations and several different sparc32 CPUs. The only combination 
that fails is a ramfs created by initramfs-tools with HyperSPARC.


trave11er notified me of some sparc trouble,
but from the above bug report i have no idea where it could come from.


	For these tests, I have installed two SuperSPARC-II modules in place of
the ROSS modules. I use a patched kernel (ESP and SMP patches):


hilbert:~# uname -a
Linux hilbert 2.6.17-rc1-patch-smp #4 SMP Tue Apr 18 16:34:12 CEST 2006 sparc GNU/Linux

hilbert:~# cat /proc/cpuinfo
cpu : Texas Instruments, Inc. - SuperSparc-(II)
fpu : SuperSparc on-chip FPU
promlib : Version 3 Revision 2
prom: 2.25
type: sun4m
ncpus probed: 2
ncpus active: 2
Cpu0Bogo: 75.16
Cpu2Bogo: 75.16
MMU type: TI Viking/MXCC
contexts: 65536
nocache total   : 5242880
nocache used: 1277440
State:
CPU0: online
CPU2: online
hilbert:~#


does fstype work?


No, it does not... fstype is a sparcv9 executable!

hilbert:~# file /usr/lib/klibc/bin/fstype
/usr/lib/klibc/bin/fstype: ELF 64-bit MSB executable, SPARC V9, version 1 (SYSV), statically linked (uses shared libs), stripped


	Even if I ask my SuperSPARCs kindly, I don't think they can understand
what they are supposed to do...
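
The mismatch reported by file can also be read directly from the first bytes of the binary: a sparc32 CPU such as SuperSPARC or HyperSPARC cannot execute a 64-bit SPARC V9 program. A minimal sketch, assuming only the standard ELF header layout (class at byte 4, byte order at byte 5, machine at bytes 18-19); the function names describe_elf and describe_file are illustrative:

```python
import struct

# e_machine values from the ELF specification for the SPARC family.
EM_NAMES = {2: "SPARC", 18: "SPARC32PLUS", 43: "SPARC V9"}


def describe_elf(header: bytes) -> str:
    """Summarise class, byte order and machine from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    if header[5] == 1:
        order, fmt = "LSB", "<H"
    elif header[5] == 2:
        order, fmt = "MSB", ">H"
    else:
        raise ValueError("unknown data encoding")
    # e_machine is a 16-bit field at offset 18, in the file's byte order.
    (machine,) = struct.unpack_from(fmt, header, 18)
    name = EM_NAMES.get(machine, "machine %d" % machine)
    return "ELF %s %s, %s" % (bits, order, name)


def describe_file(path: str) -> str:
    """Read the first 20 bytes of a file and describe its ELF header."""
    with open(path, "rb") as handle:
        return describe_elf(handle.read(20))
```

Calling describe_file("/usr/lib/klibc/bin/fstype") on the affected system would be one way to confirm which architecture the klibc binaries were built for.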


	When I installed initramfs-tools, I installed the following
packages:


- busybox (1.01-4)
- libklibc (1.2.4-1)
- klibc-utils (1.2.4-1)
- libvolume-id0 (0.089-1)
- udev (0.089-1)
- initramfs-tools (0.59b)


please post output of:
/usr/lib/klibc/bin/fstype  /dev/sda1 # please use your real root
cat /proc/cmdline


hilbert:~# cat /proc/cmdline
root=/dev/md1 ro md=1,/dev/sdb4,/dev/sda4


lsmod


hilbert:~# lsmod
Module                  Size  Used by
sg                     33216  4
sr_mod                 16680  0
cdrom                      4  1 sr_mod
sunlance               15024  0
openprom                8560  0
openpromfs             17840  1
softdog                 6644  0
hilbert:~#

Don't forget that the kernel panics before it starts init.


thank you very much for your explanations.
otherwise it is impossible for me to know what is happening.

sincerely


Best wishes,

JKB