from:"Anton Ivanov"

Bug#1004392: systemd: Incorrect location of configuration files

2022-01-26 Thread Anton Ivanov

Package: systemd
Version: 247.3-6
Severity: serious
Justification: Policy 10.7

Dear Maintainer,

/usr/lib/tmpfiles.d/x11.conf should be a configuration file. Entries in it must
be disabled in order to run containers with accelerated X11 and DRI access.

As it is under lib, changes to it are overwritten on every systemd update,
breaking all containers which run X apps with direct access to the local X server.

1. There is no way to disable it permanently.
2. There is no way to override it in a way which disables the defaults.

Actually, most of that directory does not belong in /usr - it should be under
/etc as per Debian policy for configuration files and should be handled as
config on system upgrades and updates.

-- Package-specific info:

-- System Information:
Debian Release: 11.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-10-amd64 (SMP w/8 CPU threads)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_GB:en
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages systemd depends on:
ii  adduser            3.118
ii  libacl1            2.2.53-10
ii  libapparmor1       2.13.6-10
ii  libaudit1          1:3.0-2
ii  libblkid1          2.36.1-8
ii  libc6              2.31-13+deb11u2
ii  libcap2            1:2.44-1
ii  libcrypt1          1:4.4.18-4
ii  libcryptsetup12    2:2.3.5-1
ii  libgcrypt20        1.8.7-6
ii  libgnutls30        3.7.1-5
ii  libgpg-error0      1.38-2
ii  libip4tc2          1.8.7-1
ii  libkmod2           28-1
ii  liblz4-1           1.9.3-2
ii  liblzma5           5.2.5-2
ii  libmount1          2.36.1-8
ii  libpam0g           1.4.0-9+deb11u1
ii  libseccomp2        2.5.1-1+deb11u1
ii  libselinux1        3.1-3
ii  libsystemd0        247.3-6
ii  libzstd1           1.4.8+dfsg-2.1
ii  mount              2.36.1-8
ii  ntp [time-daemon]  1:4.2.8p15+dfsg-1
ii  util-linux         2.36.1-8

Versions of packages systemd recommends:
ii  dbus  1.12.20-2

Versions of packages systemd suggests:
ii  policykit-1        0.105-31
pn  systemd-container  

Versions of packages systemd is related to:
pn  dracut   
ii  initramfs-tools  0.140
ii  libnss-systemd   247.3-6
ii  libpam-systemd   247.3-6
ii  udev 247.3-6

-- Configuration Files:
/etc/systemd/logind.conf changed:
[Login]
KillUserProcesses=yes
KillExcludeUsers=root


-- no debconf information

Bug#989571: linux-image-5.10.0-0.bpo.3-amd64: Incorrect large USB disk sizing leading to data corruption

2021-06-07 Thread Anton Ivanov

Package: src:linux
Version: 5.10.13-1~bpo10+1
Severity: critical
Justification: causes serious data loss

Dear Maintainer,

Large USB drives (for example, a Seagate 4TB Backup) which work perfectly fine with 
4.19 are identified with an incorrect size. The 4TB USB drive is 
identified as a 17GB device and, for some unfathomable reason, mounted as loop. The 
result is severe data corruption making all 4TB of data on the drive 
unrecoverable.

Tested with the original USB bridge coming with the drive and after attaching 
the SATA drive inside to an alternative USB bridge. Same result in both cases.

-- Package-specific info:
** Version:
Linux version 5.10.0-0.bpo.3-amd64 (debian-ker...@lists.debian.org) (gcc-8 
(Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 
5.10.13-1~bpo10+1 (2021-02-11)

** Command line:
BOOT_IMAGE=diskless/amd64/vmlinuz-5.10.0-0.bpo.3-amd64 
initrd=diskless/amd64/initrd.img-5.10.0-0.bpo.3-amd64 root=/dev/nfs ip=dhcp 
nfsroot=192.168.3.3:/exports/boot/madding mitigations=off rw  --

** Tainted: S (4)
 * SMP kernel oops on an officially SMP incapable processor

** Kernel log:
[754632.929276] nfs: server 192.168.3.3 OK
[754635.600887] rpc_check_timeout: 443 callbacks suppressed
[754635.600889] nfs: server 192.168.3.3 not responding, still trying
[754635.612996] nfs: server 192.168.3.3 not responding, still trying
[754635.625266] nfs: server 192.168.3.3 not responding, still trying
[754635.625462] nfs: server 192.168.3.3 not responding, still trying
[754635.637374] nfs: server 192.168.3.3 not responding, still trying
[754635.649472] nfs: server 192.168.3.3 not responding, still trying
[754635.661739] nfs: server 192.168.3.3 not responding, still trying
[754635.661922] nfs: server 192.168.3.3 not responding, still trying
[754635.673850] nfs: server 192.168.3.3 not responding, still trying
[754635.686131] nfs: server 192.168.3.3 not responding, still trying
[791938.374623] lxc-bridge0: port 3(tap-opsft2-0) entered blocking state
[791938.374628] lxc-bridge0: port 3(tap-opsft2-0) entered forwarding state
[791938.374654] lxc-bridge0: port 4(tap-opsft3-0) entered blocking state
[791938.374655] lxc-bridge0: port 4(tap-opsft3-0) entered forwarding state
[791938.375075] lxc-bridge0: port 2(tap-opsft1-0) entered blocking state
[791938.375078] lxc-bridge0: port 2(tap-opsft1-0) entered forwarding state
[791938.388241] k8-bridge0: port 2(tap-opsft1-1) entered blocking state
[791938.388243] k8-bridge0: port 2(tap-opsft1-1) entered forwarding state
[791938.388402] k8-bridge0: port 4(tap-opsft3-1) entered blocking state
[791938.388405] k8-bridge0: port 4(tap-opsft3-1) entered forwarding state
[791938.388481] k8-bridge0: port 3(tap-opsft2-1) entered blocking state
[791938.388484] k8-bridge0: port 3(tap-opsft2-1) entered forwarding state
[801076.265404] usb 4-2.4: new SuperSpeed Gen 1 USB device number 5 using 
xhci_hcd
[801076.289933] usb 4-2.4: New USB device found, idVendor=174c, idProduct=55aa, 
bcdDevice= 1.00
[801076.289937] usb 4-2.4: New USB device strings: Mfr=2, Product=3, 
SerialNumber=1
[801076.289939] usb 4-2.4: Product: ASM105x
[801076.289940] usb 4-2.4: Manufacturer: ASMT
[801076.289942] usb 4-2.4: SerialNumber: 
[801076.291139] scsi host10: uas
[801076.291557] scsi 10:0:0:0: Direct-Access ASMT 2115 0
PQ: 0 ANSI: 6
[801076.292065] sd 10:0:0:0: Attached scsi generic sg0 type 0
[801076.292232] sd 10:0:0:0: [sda] Spinning up disk...
[801077.321342] ..ready
[801082.447597] sd 10:0:0:0: [sda] 7814037168 512-byte logical blocks: (4.00 
TB/3.64 TiB)
[801082.447600] sd 10:0:0:0: [sda] 4096-byte physical blocks
[801082.447673] sd 10:0:0:0: [sda] Write Protect is off
[801082.447674] sd 10:0:0:0: [sda] Mode Sense: 43 00 00 00
[801082.447832] sd 10:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
doesn't support DPO or FUA
[801082.448032] sd 10:0:0:0: [sda] Optimal transfer size 33553920 bytes not a 
multiple of physical block size (4096 bytes)
[801082.494646] sd 10:0:0:0: [sda] Attached SCSI disk
[801150.687429] loop: module loaded
[801150.815997] EXT4-fs (loop0): mounted filesystem with ordered data mode. 
Opts: (null)
[803002.579925] blk_update_request: I/O error, dev loop0, sector 0 op 
0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803002.579960] blk_update_request: I/O error, dev loop0, sector 0 op 
0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803017.725341] EXT4-fs (loop0): mounted filesystem with ordered data mode. 
Opts: (null)
[803081.125594] blk_update_request: I/O error, dev loop0, sector 0 op 
0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803081.125635] blk_update_request: I/O error, dev loop0, sector 0 op 
0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803085.522063] EXT4-fs (loop0): mounted filesystem with ordered data mode. 
Opts: (null)
[803239.336895] blk_update_request: I/O error, dev loop0, sector 0 op 
0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
[803239.336950] blk_update_request: I/O

Bug#983379: [PATCH] um: mark all kernel symbols as local

2021-03-05 Thread Anton Ivanov


On 05/03/2021 20:43, Johannes Berg wrote:

From: Johannes Berg 

Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
  #26 0x60015b5d in sem_init () at ipc/sem.c:268
  #27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
  #28 0x7f8990ab8fb2 in call_init (...) at dl-init.c:72
...
  #40 0x7f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
  #44 0x7f8990895e35 in _nss_compat_getgrnam_r (...) at 
nss_compat/compat-grp.c:486
  #45 0x7f8990968b85 in __getgrnam_r [...]
  #46 0x7f89909d6b77 in grantpt [...]
  #47 0x7f8990a9394e in __GI_openpty [...]
  #48 0x604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
  #49 0x604a58d0 in start_idle_thread (...) at 
arch/um/os-Linux/skas/process.c:598
  #50 0x60004a3d in start_uml () at arch/um/kernel/skas/process.c:45
  #51 0x600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
  #52 0x6000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf 
Signed-off-by: Johannes Berg 
---
  arch/um/kernel/dyn.lds.S | 6 ++++++
  arch/um/kernel/uml.lds.S | 6 ++++++
  2 files changed, 12 insertions(+)

diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index dacbfabf66d8..2f2a8ce92f1e 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -6,6 +6,12 @@ OUTPUT_ARCH(ELF_ARCH)
  ENTRY(_start)
  jiffies = jiffies_64;
  
+VERSION {
+  {
+    local: *;
+  };
+}
+
  SECTIONS
  {
PROVIDE (__executable_start = START);
diff --git a/arch/um/kernel/uml.lds.S b/arch/um/kernel/uml.lds.S
index 45d957d7004c..7a8e2b123e29 100644
--- a/arch/um/kernel/uml.lds.S
+++ b/arch/um/kernel/uml.lds.S
@@ -7,6 +7,12 @@ OUTPUT_ARCH(ELF_ARCH)
  ENTRY(_start)
  jiffies = jiffies_64;
  
+VERSION {
+  {
+    local: *;
+  };
+}
+
  SECTIONS
  {
/* This must contain the right address - not quite the default ELF one.*/



Acked-By: Anton Ivanov 
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

Bug#983379: linux uml segfault

2021-03-05 Thread Anton Ivanov





On 05/03/2021 18:32, Johannes Berg wrote:



On 5 March 2021 18:39:42 CET, Anton Ivanov  
wrote:



On 04/03/2021 07:47, Johannes Berg wrote:

On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote:


Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


objcopy (from binutils) can localize symbols (i.e., objcopy -L
sem_init $orig_file $new_file).  It also does renaming symbols.  But
not sure this is the ideal solution.


Yes, we started thinking about it but it was too late at night when I
replied ...

I think there's basically a way to have an external list of symbols to
export, for symbol versioning, that we could/should use to basically not
export any of the kernel symbols out to libs.


How does UML handle symbol conflicts between userspace code and Linux
kernel (like this case sem_init) ?  AFAIK, libnl has a same symbol as
Linux kernel (genlmsg_put) and others can possibly do as well.


I fear it doesn't?


Let's assume it does not, and try to fix this by de-conflicting the
symbol.
For the time being, also, let's aim for a Debian specific patch just to
go into their "patches" dir for build so that UML is not dropped out of
the release.

This should make all internal uses of sem_init be um_sem_init in the
actual object files. I will chase the issue of it picking up glibc
memcpy separately.
Upon close inspection it looks like a different issue - it is in the
other direction (picking a dynamic symbol instead of the one from the
tree). I spent all day chasing it today and I cannot reproduce it. At
the same time it was reproducible yesterday without any problems :(



+#ifdef CONFIG_UML
+void __init um_sem_init(void)
+#else
  void __init sem_init(void)
+#endif


Might be easier to just

#define sem_init um_sem_init

in an appropriate header file, perhaps even in arch/um/?


I thought of that, but surrendered to the "dark side" of the quick and ugly fix.

We can do that for the ipc/sem.c - it brings in uaccess.h which ultimately 
pulls uaccess from our asm tree. So if we do it there, it will end up in sem.c

However, that function is also referenced and is invoked out of ipc/util.c 
which does not pull that include.

I am going to dig through the rest of our includes to see if we can find a suitable one 
which will be picked up by both sem.c and util.c. I hope there is a place which we can 
use for a "proper" fix.

By the way, I actually remember seeing a couple of includes like that somewhere 
dealing with other um symbol conflicts, just can't remember where I saw it.




johannes



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-05 Thread Anton Ivanov





On 04/03/2021 07:47, Johannes Berg wrote:

On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote:


Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


objcopy (from binutils) can localize symbols (i.e., objcopy -L
sem_init $orig_file $new_file).  It also does renaming symbols.  But
not sure this is the ideal solution.


Yes, we started thinking about it but it was too late at night when I
replied ...

I think there's basically a way to have an external list of symbols to
export, for symbol versioning, that we could/should use to basically not
export any of the kernel symbols out to libs.


How does UML handle symbol conflicts between userspace code and Linux
kernel (like this case sem_init) ?  AFAIK, libnl has a same symbol as
Linux kernel (genlmsg_put) and others can possibly do as well.


I fear it doesn't?


Let's assume it does not, and try to fix this by de-conflicting the symbol.
For the time being, also, let's aim for a Debian specific patch just to go into their 
"patches" dir for build so that UML is not dropped out of the release.

This should make all internal uses of sem_init be um_sem_init in the actual 
object files. I will chase the issue of it picking up glibc memcpy separately.
Upon close inspection it looks like a different issue - it is in the other 
direction (picking a dynamic symbol instead of the one from the tree). I spent 
all day chasing it today and I cannot reproduce it. At the same time it was 
reproducible yesterday without any problems :(

Ritesh, can you give the following a spin - it renames sem_init as um_sem_init 
for UML only?

diff --git a/ipc/sem.c b/ipc/sem.c
index f6c30a85dadf..5157796daf54 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -263,7 +263,11 @@ void sem_exit_ns(struct ipc_namespace *ns)
 }
 #endif

+#ifdef CONFIG_UML
+void __init um_sem_init(void)
+#else
 void __init sem_init(void)
+#endif
 {
 	sem_init_ns(&init_ipc_ns);
 	ipc_init_proc_interface("sysvipc/sem",
diff --git a/ipc/util.h b/ipc/util.h
index 5766c61aed0e..b3356efb3c96 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -47,7 +47,12 @@ extern int ipc_min_cycle;
 #define IPCMNI_IDX_MASK	((1 << IPCMNI_SHIFT) - 1)
 #endif /* CONFIG_SYSVIPC_SYSCTL */

+#ifdef CONFIG_UML
+void um_sem_init(void);
+#define sem_init() um_sem_init()
+#else
 void sem_init(void);
+#endif
 void msg_init(void);
 void shm_init(void);





johannes




--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-05 Thread Anton Ivanov




On 04/03/2021 18:41, Anton Ivanov wrote:



On 04/03/2021 08:05, Benjamin Berg wrote:

On Thu, 2021-03-04 at 08:47 +0100, Johannes Berg wrote:
On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote:


Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


objcopy (from binutils) can localize symbols (i.e., objcopy -L
sem_init $orig_file $new_file).  It also does renaming symbols.  But
not sure this is the ideal solution.


Yes, we started thinking about it but it was too late at night when I
replied ...

I think there's basically a way to have an external list of symbols to
export, for symbol versioning, that we could/should use to basically not
export any of the kernel symbols out to libs.

Maybe using the ld --version-script= option here works to mark all
kernel symbols as being "local" and prevent them from being picked up
by libraries.

Benjamin


How does UML handle symbol conflicts between userspace code and Linux
kernel (like this case sem_init) ?  AFAIK, libnl has a same symbol as
Linux kernel (genlmsg_put) and others can possibly do as well.


I fear it doesn't?


I can confirm that it did and this bug is bisect-able.

with 5.7

# dd if=/dev/ubda of=/dev/null bs=1M
16384+1 records in
16384+1 records out
17179869696 bytes (17 GB, 16 GiB) copied, 10.6973 s, 1.6 GB/s

with 5.10 the speed is 2.2 GB/s
5.7 with the "strings from glibc" patch: the speed is 2.2 GB/s

As we did not do anything else in this timeframe to jack up the speed from 1.6 GB/s to 
2.2 GB/s, and as it is identical to the speed you get with the "use glibc 
strings.h" patch, this looks like a good criterion to bisect on.

I am going to do a bisect with 5.7 "good" and 5.10 "bad" using the speed test 
as a working hypothesis.


This is proving very "interesting" to chase down, because the "picking the 
wrong library" behaviour does not happen every time.

For example, yesterday my 5.10 builds were picking glibc memcpy and friends. Today, with 
the same config and everything else the same, it is picking the built-ins.

I need to find some better way to reproduce this.

A.




A.




johannes


___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um







--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-04 Thread Anton Ivanov





On 04/03/2021 08:05, Benjamin Berg wrote:

On Thu, 2021-03-04 at 08:47 +0100, Johannes Berg wrote:
On Thu, 2021-03-04 at 14:38 +0900, Hajime Tazaki wrote:


Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


objcopy (from binutils) can localize symbols (i.e., objcopy -L
sem_init $orig_file $new_file).  It also does renaming symbols.  But
not sure this is the ideal solution.


Yes, we started thinking about it but it was too late at night when I
replied ...

I think there's basically a way to have an external list of symbols to
export, for symbol versioning, that we could/should use to basically not
export any of the kernel symbols out to libs.

Maybe using the ld --version-script= option here works to mark all
kernel symbols as being "local" and prevent them from being picked up
by libraries.

Benjamin


How does UML handle symbol conflicts between userspace code and Linux
kernel (like this case sem_init) ?  AFAIK, libnl has a same symbol as
Linux kernel (genlmsg_put) and others can possibly do as well.


I fear it doesn't?


I can confirm that it did and this bug is bisect-able.

with 5.7

# dd if=/dev/ubda of=/dev/null bs=1M
16384+1 records in
16384+1 records out
17179869696 bytes (17 GB, 16 GiB) copied, 10.6973 s, 1.6 GB/s

with 5.10 the speed is 2.2 GB/s
5.7 with the "strings from glibc" patch: the speed is 2.2 GB/s

As we did not do anything else in this timeframe to jack up the speed from 1.6 GB/s to 
2.2 GB/s, and as it is identical to the speed you get with the "use glibc 
strings.h" patch, this looks like a good criterion to bisect on.

I am going to do a bisect with 5.7 "good" and 5.10 "bad" using the speed test 
as a working hypothesis.

A.




johannes





--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-03 Thread Anton Ivanov


On 04/03/2021 05:38, Hajime Tazaki wrote:


On Thu, 04 Mar 2021 07:40:00 +0900,
Johannes Berg wrote:


I think the problem is here:


#24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 ) at ipc/util.c:119
#25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at ipc/sem.c:254
#26 0x60015b5d in sem_init () at ipc/sem.c:268
#27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2


You're in the init of libcom_err.so.2, which is loaded by


"libnss_nis.so.2"


which is loaded by normal NSS code (getgrnam):


#40 0x7f89909bf3a6 in nss_load_library (ni=ni@entry=0x61497db0) at
nsswitch.c:359
#41 0x7f89909bfc39 in __GI___nss_lookup_function (ni=0x61497db0,
fct_name=, fct_name@entry=0x7f899089b020 "setgrent") at
nsswitch.c:467
#42 0x7f899089554b in init_nss_interface () at nss_compat/compat-
grp.c:83
#43 init_nss_interface () at nss_compat/compat-grp.c:79
#44 0x7f8990895e35 in _nss_compat_getgrnam_r (name=0x7f8990a2a1e0
"tty", grp=0x7ffe3e7a2910, buffer=0x7ffe3e7a24e0 "", buflen=1024,
errnop=0x7f899089eb00) at nss_compat/compat-grp.c:486
#45 0x7f8990968b85 in __getgrnam_r (name=name@entry=0x7f8990a2a1e0
"tty", resbuf=resbuf@entry=0x7ffe3e7a2910,
buffer=buffer@entry=0x7ffe3e7a24e0 "", buflen=1024,
result=result@entry=0x7ffe3e7a2908)
 at ../nss/getXXbyYY_r.c:315



You have a strange nsswitch configuration that causes all of this
(libnss_nis.so.2 -> libcom_err.so.2) to get loaded.

Now libcom_err.so.2 is trying to call sem_init(), and that gets ... tada
... Linux's sem_init() instead of libpthread's.

And then the crash.

Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


objcopy (from binutils) can localize symbols (i.e., objcopy -L
sem_init $orig_file $new_file).  It also does renaming symbols.  But
not sure this is the ideal solution.

How does UML handle symbol conflicts between userspace code and Linux
kernel (like this case sem_init) ?  AFAIK, libnl has a same symbol as
Linux kernel (genlmsg_put) and others can possibly do as well.


It used to handle them. I do not think it does now - something broke and 
it's fairly recent.


I actually have something which confirms this.

I worked on a patch around 5.8-5.9 which would give the option to pick 
up libc equivalents for the functions from string.h, and there was a 
clear performance difference of 20% or more. This is because UML has no means 
of optimizing them and picks up the worst-case x86 version.


I parked that for a while, because I had to look at other stuff at work.

I restarted working on it after 5.10. My first observation was that, 
despite not changing anything in the patches, the gain was no longer 
there. The performance was the same as if it picked up libc equivalents.


I can either try to reproduce the nss config which causes the sem_init 
issue or use my own libc patchset to try to bisect. The problem commit 
will be roughly around the time the performance difference from applying 
the "switch to libc" patch goes away.


Brgds,

A.



-- Hajime





--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-03 Thread Anton Ivanov


On 03/03/2021 22:40, Johannes Berg wrote:

I think the problem is here:


#24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 ) at ipc/util.c:119
#25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at ipc/sem.c:254
#26 0x60015b5d in sem_init () at ipc/sem.c:268
#27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2


You're in the init of libcom_err.so.2, which is loaded by


"libnss_nis.so.2"


which is loaded by normal NSS code (getgrnam):


#40 0x7f89909bf3a6 in nss_load_library (ni=ni@entry=0x61497db0) at
nsswitch.c:359
#41 0x7f89909bfc39 in __GI___nss_lookup_function (ni=0x61497db0,
fct_name=, fct_name@entry=0x7f899089b020 "setgrent") at
nsswitch.c:467
#42 0x7f899089554b in init_nss_interface () at nss_compat/compat-
grp.c:83
#43 init_nss_interface () at nss_compat/compat-grp.c:79
#44 0x7f8990895e35 in _nss_compat_getgrnam_r (name=0x7f8990a2a1e0
"tty", grp=0x7ffe3e7a2910, buffer=0x7ffe3e7a24e0 "", buflen=1024,
errnop=0x7f899089eb00) at nss_compat/compat-grp.c:486
#45 0x7f8990968b85 in __getgrnam_r (name=name@entry=0x7f8990a2a1e0
"tty", resbuf=resbuf@entry=0x7ffe3e7a2910,
buffer=buffer@entry=0x7ffe3e7a24e0 "", buflen=1024,
result=result@entry=0x7ffe3e7a2908)
 at ../nss/getXXbyYY_r.c:315



You have a strange nsswitch configuration that causes all of this
(libnss_nis.so.2 -> libcom_err.so.2) to get loaded.

Now libcom_err.so.2 is trying to call sem_init(), and that gets ... tada
... Linux's sem_init() instead of libpthread's.

And then the crash.

Now, I don't know how to fix it (short of changing your nsswitch
configuration) - maybe we could somehow rename sem_init()? Or maybe we
can somehow give the kernel binary a lower symbol resolution than the
libc/libpthread.


I have not looked in depth in how the linking process works, but it 
should have picked up the sem_init from the kernel library, not libc.


We are already supposed to do that regarding kernel vs libc string.h 
functions - memcpy, etc.


For all of those, though, libc does the same thing, so invoking the wrong 
one does not kill you. This may therefore have been broken for a while 
without us noticing.





johannes





--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-03 Thread Anton Ivanov




On 03/03/2021 10:45, Ritesh Raj Sarraf wrote:

Hi Anton,

On Wed, 2021-03-03 at 09:30 +0000, Anton Ivanov wrote:

OTOH, I have one more user (other than you) who's not been able to
reproduce the issue.


I will do a bisect the moment I figure out how to reproduce it. I
will try to do some more experiments on that tomorrow.

I tried to alter the userspace a bit, but it makes no difference.

Out of curiosity, what are you running it on?


Bare-metal machines. 3 different machines, all Intel processors.
And it fails on all 3 of them.


Hmmm...

All mine are AMD. I can try to boot up an Intel later today with Bullseye to 
see if it makes a difference.


On the distribution side, all 3 of them run Debian Unstable, with Linux
5.10.13


The code here is:

static inline u32 printk_caller_id(void)
{
	return in_task() ? task_pid_nr(current) :
		0x80000000 + raw_smp_processor_id();
}


That is something which should not bomb out unless we have memory
corruption or something along those lines - current being invalid.


Must be something different. Not all machines could have bad memory at
the same time.


I did not mean bad memory. I meant memory corruption as a result of race, 
buffer overrun or anything else like that.





--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-03 Thread Anton Ivanov





On 02/03/2021 17:27, Ritesh Raj Sarraf wrote:

On Tue, 2021-03-02 at 17:05 +0000, Anton Ivanov wrote:

So the best I can extract for you is to compile the kernel with as much
information as possible.


Can you try using one of the older kernels so we can verify if this
is indeed a 5.10 thing.



That was the first thing I tried. I tested it with 5.10, 5.9 and 5.4.
All 3 crashed. That's when I knew this one was going to be painful one
to conclude.

The only other input I have is that I have one more user who's reported
to be able to reproduce the issue.

OTOH, I have one more user (other than you) who's not been able to
reproduce the issue.


I will do a bisect the moment I figure out how to reproduce it. I
will try to do some more experiments on that tomorrow.


I tried to alter the userspace a bit, but it makes no difference.

Out of curiosity, what are you running it on?




Meanwhile, I enabled some debug info in the kernel. Here's what I have
got so far:

```
(gdb) bt
#0  0x7f89908dc087 in kill () at ../sysdeps/unix/syscall-
template.S:120
#1  0x604a3514 in uml_abort () at arch/um/os-Linux/util.c:94
#2  0x604a3791 in os_dump_core () at arch/um/os-
Linux/util.c:149
#3  0x6048d126 in panic_exit (self=0x2e66d5, unused1=6,
unused2=0x0) at arch/um/kernel/um_arch.c:217
#4  0x604c725a in notifier_call_chain (nl=0x2e66d5, val=0,
v=0x60d82f40 , nr_to_call=-1, nr_calls=0x0) at
kernel/notifier.c:83
#5  0x604c72f6 in atomic_notifier_call_chain (nh=0x2e66d5,
val=6, v=0x0) at kernel/notifier.c:217
#6  0x60a54607 in panic (fmt=0x60a55225 
"UH\211\345H\201\354", ) at
kernel/panic.c:272
#7  0x6048cca3 in segv (fi=, ip=1615717312,
is_user=0, regs=0x60c2ee58 ) at
arch/um/kernel/trap.c:246
#8  0x6048ce64 in segv_handler (sig=3040981, unused_si=0x6,
regs=0x60c2ee58 ) at arch/um/kernel/trap.c:190
#9  0x604a2556 in sig_handler_common (sig=11, si=0x60c2fbf0
, mc=0x60c2fae8 ) at
arch/um/os-Linux/signal.c:48
#10 0x604a2aa2 in sig_handler (sig=3040981, si=0x6, mc=0x0) at
arch/um/os-Linux/signal.c:81
#11 0x604a265f in hard_handler (sig=3040981, si=0x60c2fbf0
, p=0x0) at arch/um/os-Linux/signal.c:180
#12 


The code here is:

static inline u32 printk_caller_id(void)
{
	return in_task() ? task_pid_nr(current) :
		0x80000000 + raw_smp_processor_id();
}


That is something which should not bomb out unless we have memory corruption or 
something along those lines - current being invalid.

A.


#13 0x604de3c0 in printk_caller_id () at
kernel/printk/printk.c:1924
#14 log_output (text_len=, text=,
dev_info=, lflags=, level=, facility=) at kernel/printk/printk.c:1932
#15 vprintk_store (facility=1624806843, level=5, dev_info=0x0, fmt=0x35
, args=0x1) at
kernel/printk/printk.c:2004
#16 0x604de8b7 in vprintk_emit (facility=1624806843,
level=1622768673, dev_info=0x35, fmt=0x1 , args=0x60b97c22) at kernel/printk/printk.c:2029
#17 0x604debad in vprintk_deferred (fmt=0x1 , args=0x60b97c21) at
kernel/printk/printk.c:3079
#18 0x60a554de in printk_deferred (fmt=0x60d895bb 
"\n") at kernel/printk/printk.c:3091
#19 0x6092680f in _warn_unseeded_randomness
(previous=, caller=, func_name=) at drivers/char/random.c:1534
#20 _warn_unseeded_randomness (func_name=0x60abf380 <__func__.38>
"get_random_u32", caller=0x608b5f25 ,
previous=0x35) at drivers/char/random.c:1516
#21 0x60927d47 in get_random_u32 () at
drivers/char/random.c:2221
#22 0x608b5f25 in bucket_table_alloc (nbuckets=64, gfp=3264,
ht=) at lib/rhashtable.c:203
#23 0x608b6733 in rhashtable_init (ht=0x60c60e30
, params=0x608b5e06 ) at
lib/rhashtable.c:1061
#24 0x6080f234 in ipc_init_ids (ids=0x60c60de8 )
at ipc/util.c:119
#25 0x60813c6d in sem_init_ns (ns=0x60d895bb ) at
ipc/sem.c:254
#26 0x60015b5d in sem_init () at ipc/sem.c:268
#27 0x7f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
#28 0x7f8990ab8fb2 in call_init (l=,
argc=argc@entry=5, argv=argv@entry=0x7ffe3e7a4c98,
env=env@entry=0x7ffe3e7a4cc8) at dl-init.c:72
#29 0x7f8990ab90b9 in call_init (env=0x7ffe3e7a4cc8,
argv=0x7ffe3e7a4c98, argc=5, l=) at dl-init.c:30
#30 _dl_init (main_map=0x61497ea0, argc=5, argv=0x7ffe3e7a4c98,
env=0x7ffe3e7a4cc8) at dl-init.c:119
#31 0x7f89909d82bd in __GI__dl_catch_exception
(exception=exception@entry=0x0, operate=operate@entry=0x7f8990abc5a0
, args=args@entry=0x7ffe3e7a1e80) at dl-error-skeleton.c:182
#32 0x7f8990abd028 in dl_open_worker (a=a@entry=0x7ffe3e7a2020) at
dl-open.c:758
#33 0x7f89909d8260 in __GI__dl_catch_exception
(exception=exception@entry=0x7ffe3e7a2000,
operate=operate@entry=0x7f8990abcc70 ,
args=args@entry=0x7ffe3e7a2020) at dl-error-skeleton.c:208
#34 0x7f8990abc8ca in _dl_open (file=0x7ffe3e7a22a0
"libnss_nis.so.2", mode=-2147483646, caller_dlopen=0x7f89909bf3a6
, nsid=-2, argc=5, argv=0x7ffe3e7a2

Bug#983379: linux uml segfault

2021-03-02 Thread Anton Ivanov





On 02/03/2021 14:23, Ritesh Raj Sarraf wrote:

On Tue, 2021-03-02 at 11:34 +0000, Anton Ivanov wrote:

If gdb gives you the exact lines, that may be helpful.


It doesn't. But it does show drawbacks in my packaging. The debug
symbols packaged are not read/honored by gdb at all.

```
Reading symbols from /usr/bin/linux.uml...
Reading symbols from /usr/lib/debug/.build-id/6f/ea141539149074c72e80fb8004de124fda115b.debug...
(No debugging symbols found in /usr/lib/debug/.build-id/6f/ea141539149074c72e80fb8004de124fda115b.debug)

warning: Can't open file /dev/shm/#20817 (deleted) during file-backed mapping note processing
[New LWP 18788]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `linux ubd0=qemu-linux-image.img'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f51842c0087 in kill () at ../sysdeps/unix/syscall-template.S:120
120 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x7f51842c0087 in kill () at ../sysdeps/unix/syscall-template.S:120
#1  0x6049dc20 in uml_abort ()
#2  0x6049de7a in os_dump_core ()
#3  0x60486e47 in panic_exit ()
#4  0x604c0a03 in notifier_call_chain ()
#5  0x604c0a98 in atomic_notifier_call_chain ()
#6  0x60a26b85 in panic ()
#7  0x604869e1 in segv ()
#8  0x60486ba9 in segv_handler ()
#9  0x6049ccc0 in sig_handler_common ()
#10 0x6049d1ec in sig_handler ()
#11 0x6049cdc6 in hard_handler ()
#12 <signal handler called>
#13 0x604d45b4 in vprintk_store ()
#14 0x604d4aa8 in vprintk_emit ()
#15 0x604d4d86 in vprintk_deferred ()
#16 0x60a27a02 in printk_deferred ()
#17 0x609031b2 in get_random_u32 ()
#18 0x6088ff65 in bucket_table_alloc.isra ()
#19 0x60890740 in rhashtable_init ()
#20 0x607efaa2 in ipc_init_ids ()
#21 0x600153c9 in sem_init ()
```

So the best I can extract for you is to compile the kernel with as much
information as possible.


Can you try using one of the older kernels so we can verify whether this is indeed a 
5.10 thing?

I will do a bisect as soon as I figure out how to reproduce it. I will try to 
do some more experiments on that tomorrow.



Thanks,
Ritesh



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-03-02 Thread Anton Ivanov





On 02/03/2021 09:09, Ritesh Raj Sarraf wrote:

On Wed, 2021-02-24 at 11:44 +0000, Anton Ivanov wrote:

In all cases it boots cleanly and there are no segfaults.

So, frankly, no idea what is causing it to crash - I have run most
combinations of 5.10 on a 5.10, all work fine here.


Is there any other way I can help you with this issue ?
I do have the core dump available on my local machine.


If gdb gives you the exact lines, that may be helpful.

I have looked through the backtrace several times; it is a code path which my 
set-up cruises through without any trouble.

The actual moment you see in the backtrace is this one:

[0.080000] random: get_random_u32 called from bucket_table_alloc.isra.0+0x115/0x13d with crng_init=0

However, in your case, instead of getting this printk warning out it blows up.

Why - I don't know.

A.





___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-02-24 Thread Anton Ivanov





On 23/02/2021 17:26, Ritesh Raj Sarraf wrote:

Added the debian bug report in CC.

On Tue, 2021-02-23 at 17:19 +0000, Anton Ivanov wrote:

The current Debian user-mode-linux package in unstable is based on
the 5.10.5 stable source which includes the mentioned patch, but is
still causing an error for some users.


After updating the tree to 5.10.5 and applying all Debian patches
from the package, I cannot reproduce the bug.

I am running it on 5.10, 5.2 and 4.19 hosts with the same parameters
without issues. Hosts are all up to date Debian 10.8 and so is the
UML userspace.



Did you mean 5.10, 5.2 and 4.19 (UML) guests ?

We've seen this happen on Debian Testing and Unstable Host (of which
the former would soon be the next stable i.e. Debian Bullseye).

In our tests, when running the same linux uml binary (5.10) on a Debian
Stable Host, it is working fine.


I cannot reproduce it on a physical Bullseye host using the Debian 
user-mode-linux package compiled from source.

Environment - Bullseye minimal install and build deps. 6 cores/12 threads Ryzen

I cannot reproduce it using the upstream source and the patches from the 
user-mode-linux package.

Environment - same as above.

I cannot reproduce it using the upstream source + patches and compiling on 
Buster using the following:

1. Bullseye physical host, minimal install, same hardware

2. Bullseye VM, minimal install, running with 4 vCPUs on the same host

3. Bullseye LXC container running on a Debian Buster host, minimal install, 
same hardware

In all cases it boots cleanly and there are no segfaults.

So, frankly, no idea what is causing it to crash - I have run most combinations 
of 5.10 on a 5.10, all work fine here.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#983379: linux uml segfault

2021-02-23 Thread Anton Ivanov


On 23/02/2021 17:26, Ritesh Raj Sarraf wrote:

Added the debian bug report in CC.

On Tue, 2021-02-23 at 17:19 +0000, Anton Ivanov wrote:

The current Debian user-mode-linux package in unstable is based on
the 5.10.5 stable source which includes the mentioned patch, but is
still causing an error for some users.


After updating the tree to 5.10.5 and applying all Debian patches
from the package, I cannot reproduce the bug.

I am running it on 5.10, 5.2 and 4.19 hosts with the same parameters
without issues. Hosts are all up to date Debian 10.8 and so is the
UML userspace.



Did you mean 5.10, 5.2 and 4.19 (UML) guests ?


No. Hosts.

I have several 6core/12thread Ryzens which are used for development 
testing.


They all use identical userspace with the sole difference being the 
kernel. They all use a selection of 5.x because 4.19 does not support 
the hardware properly.


The 4.19 testing is done on my old "test farm" which is all A8s and 
Athlon X760.




We've seen this happen on Debian Testing and Unstable Host (of which
the former would soon be the next stable i.e. Debian Bullseye).






In our tests, when running the same linux uml binary (5.10) on a Debian
Stable Host, it is working fine.




OK. I will upgrade one of my systems to Debian testing to try to 
reproduce this.



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#940821: NFS Caching broken in 4.19.37

2021-02-20 Thread Anton Ivanov

On 20/02/2021 20:04, Salvatore Bonaccorso wrote:

Hi,

On Mon, Jul 08, 2019 at 07:19:54PM +0100, Anton Ivanov wrote:

Hi list,

NFS caching appears broken in 4.19.37.

The more cores/threads the easier to reproduce. Tested with identical
results on Ryzen 1600 and 1600X.

1. Mount an openwrt build tree over NFS v4
2. Run make -j `cat /proc/cpuinfo | grep vendor | wc -l` ; make clean in a
loop
3. Result after 3-4 iterations:

State on the client

ls -laF
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm

total 8
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../

State as seen on the server (mounted via nfs from localhost):

ls -laF
/var/autofs/local/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
-rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h

Actual state on the filesystem:

ls -laF
/exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Jul 8 11:40 ./
drwxr-xr-x 3 anivanov anivanov 4096 Jul 8 11:40 ../
-rw-r--r-- 1 anivanov anivanov 32 Jul 8 11:40 ipcbuf.h

So the client has quite clearly lost the plot. Telling it to drop caches and
re-reading the directory shows the file present.

It is possible to reproduce this using a linux kernel tree too, it just takes
many more iterations - 10+ at least.

Both client and server run 4.19.37 from Debian buster. This is filed as
debian bug 931500. I originally thought it to be autofs related, but IMHO it
is actually something fundamentally broken in nfs caching resulting in cache
corruption.

According to the reporter downstream in Debian, at
https://bugs.debian.org/940821#26 this still seems to be reproducible with
more recent kernels than the one initially reported. Is there anything Anton
can provide to try to track down the issue?

Anton, can you reproduce with current stable series?

100% reproducible with any kernel from 4.9 to 5.4, stable or backports.
It may exist in earlier versions, but I do not have a machine with
anything before 4.9 to test at present.

It takes anywhere from 1-2 make clean && make cycles to one afternoon,
depending on the number of machine cores. The more cores/threads, the
faster it triggers.

I tried playing with protocol minor versions, caching options, etc - it
is still reproducible for any nfs4 settings as long as there is client
side caching of metadata.

Regards,
Salvatore

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

Bug#940821: closed by Bastian Blank (No response by submitter)

2021-02-20 Thread Anton Ivanov


On 20/02/2021 10:33, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report
which was filed against the src:linux package:

#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4

It has been closed by Bastian Blank.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Bastian Blank by
replying to this email.



I missed the question. Probably hit the spam bucket for some reason.

I am able to reproduce it with more recent versions as well.

The most recent one I have around is 5.4.0-0.bpo.2-amd64

Still reproducible 100% - just tested it.

It is trivial to reproduce if anyone actually bothers to do so. Just 
grab a big enough tree where make runs truly in parallel - openwrt is 
best, but even the Linux kernel does the job.


Mount it via nfs4 from another server (it will work even locally, but 
takes longer to reproduce - may take a whole afternoon)


Run while make -j 12 clean && make -j 12 ; do true ; done

Leave it to run. On 6 cores/12 threads it takes 2-3 builds of openwrt or 
~ 5-8 linux kernel builds to blow up. More cores - faster. Fewer cores - 
slower.
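
The symptom can also be checked mechanically between build iterations. A minimal sketch, assuming the check runs on a machine that can see both the NFS-mounted view and the backing directory of the same path (as with the localhost mount in the original report). The function name `listing_differs` is made up for illustration:

```sh
#!/bin/sh
# Hypothetical helper: compare the directory listing seen through the
# NFS mount with the listing of the backing directory.
# Exit status 0 means the two listings differ (the stale-cache symptom).
listing_differs() {
    nfs_view=$1   # directory as seen through the NFS mount
    backing=$2    # same directory on the exported filesystem

    a=$(ls -A "$nfs_view" | sort)
    b=$(ls -A "$backing" | sort)

    [ "$a" != "$b" ]
}
```

Called as `listing_differs "$NFS_DIR" "$EXPORT_DIR" && echo "cache is stale"`, where both variables are placeholders for the mounted and exported paths of the same directory.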


I sent it to the mailing list too, but nobody could be bothered to even 
ask any questions.


--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#938962: [PATCH] um: Add back support for extra userspace libraries

2019-10-16 Thread Anton Ivanov

PCAP and VDE network transports require linking with userspace
libraries. The current build system has no means of passing these
as arguments.

This patch adds a script to expand the library list for linking
for these transports as well as any future driver that needs to
rely on additional libraries on the userspace side.

Signed-off-by: Anton Ivanov 
---
 arch/um/scripts/extra-libs.sh | 10 ++
 scripts/link-vmlinux.sh   |  4 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 arch/um/scripts/extra-libs.sh

diff --git a/arch/um/scripts/extra-libs.sh b/arch/um/scripts/extra-libs.sh
new file mode 100644
index 000000000000..0592485e0675
--- /dev/null
+++ b/arch/um/scripts/extra-libs.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+
+# This file should be included from link-vmlinux, not executed!!!
+
+if [ "${CONFIG_UML_NET_VDE}" = "y" ] ; then
+   UML_EXTRA_LIBS="$UML_EXTRA_LIBS -lvde -lvdeplug"
+fi
+if [ "${CONFIG_UML_NET_PCAP}" = "y" ] ; then
+   UML_EXTRA_LIBS="$UML_EXTRA_LIBS -lpcap"
+fi
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 06495379fcd8..15f9e5096da0 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -90,11 +90,13 @@ vmlinux_link()
-Wl,--end-group \
${@}"
 
+   . arch/um/scripts/extra-libs.sh
+
${CC} ${CFLAGS_vmlinux} \
-o ${output}\
-Wl,-T,${lds}   \
${objects}  \
-   -lutil -lrt -lpthread
+   -lutil -lrt -lpthread ${UML_EXTRA_LIBS}
rm -f linux
fi
 }
-- 
2.20.1

Bug#938962: [PATCH] um: Fix pcap and vde driver builds

2019-10-16 Thread Anton Ivanov


On 16/10/2019 08:53, Anton Ivanov wrote:

Signed-off-by: Anton Ivanov 
---
  arch/um/drivers/Makefile | 8 
  scripts/link-vmlinux.sh  | 2 +-
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile
index 693319839f69..34355057ec85 100644
--- a/arch/um/drivers/Makefile
+++ b/arch/um/drivers/Makefile
@@ -24,6 +24,14 @@ LDFLAGS_vde.o := -r $(shell $(CC) $(CFLAGS) 
-print-file-name=libvdeplug.a)
  
  targets := pcap_kern.o pcap_user.o vde_kern.o vde_user.o
  
+ifeq ($(CONFIG_UML_NET_PCAP),y)

+   export UML_EXTRA_LIBS += -lpcap
+endif
+ifeq ($(CONFIG_UML_NET_VDE),y)
+   export UML_EXTRA_LIBS += -lvde -lvdeplug
+endif
+
+
  $(obj)/pcap.o: $(obj)/pcap_kern.o $(obj)/pcap_user.o
$(LD) -r -dp -o $@ $^ $(ld_flags)
  
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh

index 915775eb2921..d3e6a6cdfc13 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -86,7 +86,7 @@ vmlinux_link()
${CC} ${CFLAGS_vmlinux} -o ${2} \
-Wl,-T,${lds}   \
${objects}  \
-   -lutil -lrt -lpthread
+   -lutil -lrt -lpthread ${UML_EXTRA_LIBS}
rm -f linux
fi
  }



This will not work as advertised unfortunately - I have to write out the 
libs list somewhere and load it again in the link script instead of 
passing it as an environment variable.


A fixed patch will follow shortly.
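
The thread does not spell out why the environment variable approach fails. A likely reason (an assumption here, not something stated in the mail) is that the `export` happens in the sub-make processing arch/um/drivers/Makefile, while link-vmlinux.sh is started separately, so it never inherits the value. A minimal sketch of that behaviour and of the file-based alternative, using plain `sh -c` children as stand-ins for the two build steps:

```sh
#!/bin/sh
# Illustration only: a variable exported in one child process (standing
# in for the sub-make) is not visible in a sibling process (standing in
# for link-vmlinux.sh).
unset UML_EXTRA_LIBS

# "Sub-make": exports the variable, then exits.
sh -c 'export UML_EXTRA_LIBS="-lpcap"'

# "Link script": started separately by the parent, sees nothing.
seen_by_link=$(sh -c 'echo "${UML_EXTRA_LIBS:-unset}"')
echo "link script sees: $seen_by_link"

# Writing the list to a file and sourcing it works across processes.
libs_file=$(mktemp)
sh -c 'printf "UML_EXTRA_LIBS=%s\n" "-lpcap" > "$1"' sh "$libs_file"
. "$libs_file"
echo "after sourcing: $UML_EXTRA_LIBS"
rm -f "$libs_file"
```

The temporary file is only a stand-in for wherever the build would keep the list.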

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

Bug#938962: [PATCH] um: Fix pcap and vde driver builds

2019-10-16 Thread Anton Ivanov

Signed-off-by: Anton Ivanov 
---
 arch/um/drivers/Makefile | 8 
 scripts/link-vmlinux.sh  | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/um/drivers/Makefile b/arch/um/drivers/Makefile
index 693319839f69..34355057ec85 100644
--- a/arch/um/drivers/Makefile
+++ b/arch/um/drivers/Makefile
@@ -24,6 +24,14 @@ LDFLAGS_vde.o := -r $(shell $(CC) $(CFLAGS) 
-print-file-name=libvdeplug.a)
 
 targets := pcap_kern.o pcap_user.o vde_kern.o vde_user.o
 
+ifeq ($(CONFIG_UML_NET_PCAP),y)
+   export UML_EXTRA_LIBS += -lpcap
+endif
+ifeq ($(CONFIG_UML_NET_VDE),y)
+   export UML_EXTRA_LIBS += -lvde -lvdeplug
+endif
+
+
 $(obj)/pcap.o: $(obj)/pcap_kern.o $(obj)/pcap_user.o
$(LD) -r -dp -o $@ $^ $(ld_flags)
 
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 915775eb2921..d3e6a6cdfc13 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -86,7 +86,7 @@ vmlinux_link()
${CC} ${CFLAGS_vmlinux} -o ${2} \
-Wl,-T,${lds}   \
${objects}  \
-   -lutil -lrt -lpthread
+   -lutil -lrt -lpthread ${UML_EXTRA_LIBS}
rm -f linux
fi
 }
-- 
2.20.1

Bug#938962: Build fix for VDE and PCAP drivers

2019-10-16 Thread Anton Ivanov

Hi all,

A patch to fix the build for these follows.

I will stick to my original suggestion - pcap should be obsoleted in favour of
vector raw + BPF firmware load. The latter will work on interfaces where gso/gro
is enabled. The original pcap will fail on that due to the 1500 bytes size limit
in the legacy net code.

I had to dig out the root cause here and figure out what was going on while working
on an AF_XDP transport, as that had the same problem - it needed to pass -lbpf
-lelf -lz, which could not be passed under the current build system.

A.

Bug#938962: [PATCH] um: Loadable BPF "Firmware" for vector drivers

2019-10-01 Thread Anton Ivanov





On 01/10/2019 08:50, Johannes Berg wrote:

On Mon, 2019-09-30 at 14:19 +0100, Anton Ivanov wrote:

All vector drivers now allow a BPF program to be loaded and
associated with the RX socket in the host kernel.

1. The program can be loaded as an extra kernel command line
option to any of the drivers.

2. The program can also be loaded as "firmware", using the
ethtool flash option. It is possible to turn this facility
on or off using a command line option.

A simplistic wrapper for generating the BPF firmware for the raw
socket driver out of a tcpdump/libpcap filter expression can be
found at: https://github.com/kot-begemot-uk/uml_vector_utilities/


That's kinda cool.

Why just BPF though, not eBPF with all that brings?


The filter language for the SOCKOPT is specified as BPF everywhere. I 
have not looked at what the sockopt does in the host kernel under the 
hood and if it takes eBPF.


Also, the intention is to provide backward compatible wrappers for the 
existing pcap functionality as per the Debian bug which is cc-ed and 
that generates/uses basic BPF out of a pcap expression. We can add those 
to the "uml-utilities" package present in Debian and other distros.


I will try to get around to writing a wrapper which takes legacy UML 
network interface arguments and rewrites them as options for the new 
vector drivers.
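
A rough sketch of what such a wrapper might look like, offered as an illustration only and not as the wrapper that was eventually written. It assumes the legacy form `ethN=pcap,<host_if>[,<filter>...]` and the vector form `vecN:transport=raw,ifname=<host_if>`; `bpffile` is the option name used by the patch in this thread. The function name `legacy_pcap_to_vector` is hypothetical, and the filter expression is not compiled here: the caller passes the path of an already compiled BPF file.

```sh
#!/bin/sh
# Hypothetical helper, not part of uml-utilities: rewrite a legacy
# "ethN=pcap,<host_if>[,<filter>...]" argument as a vector raw argument.
# $1 = legacy argument, $2 = path to a precompiled BPF filter file.
legacy_pcap_to_vector() {
    legacy=$1
    bpf_file=$2

    dev=${legacy%%=*}      # e.g. "eth0"
    rest=${legacy#*=}      # e.g. "pcap,eth0,tcp"
    unit=${dev#eth}        # e.g. "0"

    # Only pcap transports are handled by this sketch.
    case $rest in
        pcap,*) ;;
        *) echo "not a pcap argument: $legacy" >&2; return 1 ;;
    esac

    rest=${rest#pcap,}     # e.g. "eth0,tcp"
    host_if=${rest%%,*}    # e.g. "eth0"

    printf 'vec%s:transport=raw,ifname=%s,bpffile=%s\n' \
        "$unit" "$host_if" "$bpf_file"
}
```

Tunables such as vector depth or offloads are deliberately left out, so anything beyond the interface name and filter would still have to be added by hand.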




Or is that because the BPF filter is actually attached to the socket in
the host, if I'm reading this correctly?


Yes. The idea is to offload it from the guest to the host. I have had 
this idea as well as some PoC code to do that since like 2012. (e)BPF is 
an excellent way to represent "firmware" for vNICs, I am surprised it is 
not in active use :)


It should be possible to expand the concept to other stuff like AF_XDP, 
etc., but I need to get around to implementing that in the first place.





Couple of style nits below:


+static bool get_bpf_flash(struct arglist *def)
+{
+   return uml_vector_fetch_arg(def, "bpfflash") != NULL;
+}
+
+


Needs just one blank line?


@@ -1125,6 +1142,7 @@ static int vector_net_close(struct net_device *dev)
netif_stop_queue(dev);
del_timer(&vp->tl);
  
+

if (vp->fds == NULL)
return 0;


not needed
  

@@ -1139,6 +1157,8 @@ static int vector_net_close(struct net_device *dev)
}
tasklet_kill(&vp->tx_poll);
if (vp->fds->rx_fd > 0) {
+   if (vp->bpf)
+   uml_vector_detach_bpf(vp->fds->rx_fd, vp->bpf);
os_close_file(vp->fds->rx_fd);
vp->fds->rx_fd = -1;
}


I guess you moved some code here or something and the blank line was
left?


+/*
+ * We cannot use the firmware.c loader API here because this is not a module
+ *  and we do not have a proper device structure to pass to it as required
+ *  by the firmware API
+ */


You just have to make up a platform device, see e.g. net/wireless/reg.c.
IMHO better than open-coding all this.


Good idea.




@@ -1528,8 +1618,9 @@ static void vector_eth_configure(
.in_write_poll  = false,
.coalesce   = 2,
.req_size   = get_req_size(def),
-   .in_error   = false
-   });
+   .in_error   = false,
+   .bpf= NULL
+   });


That's not really needed, but I guess you have everything here anyway.


+int uml_vector_detach_bpf(int fd, void *bpf)
+{
+   struct sock_fprog *prog = bpf;
+
+   int err = setsockopt(fd, SOL_SOCKET, SO_DETACH_FILTER, bpf, 
sizeof(struct sock_fprog));


Spurious blank line, line too long.
  

-void *uml_vector_default_bpf(int fd, void *mac)
+   if (err < 0)
+   printk(KERN_ERR BPF_DETACH_FAIL, prog->len, prog->filter, fd, 
-errno);


also looks pretty long


+   return err;
+}
+void *uml_vector_default_bpf(void *mac)
  {
struct sock_filter *bpf;
uint32_t *mac1 = (uint32_t *)(mac + 2);
uint16_t *mac2 = (uint16_t *) mac;
-   struct sock_fprog bpf_prog = {
-   .len = 6,
-   .filter = NULL,
-   };
+   struct sock_fprog *bpf_prog;
  
+	bpf_prog = uml_kmalloc(sizeof(struct sock_fprog), UM_GFP_KERNEL);

+   if (bpf_prog != NULL) {


generally, kernel coding style prefers to remove " != NULL" (per
checkpatch, anyway)


+   bpf_prog->len = DEFAULT_BPF_LEN;
+   bpf_prog->filter = NULL;
+   } else
+   return NULL;


and braces on all branches of if statements

johannes



Ack - I will look at the other bits, thanks for reviewing it.






--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

Bug#938962: [PATCH] um: Loadable BPF "Firmware" for vector drivers

2019-09-30 Thread Anton Ivanov

All vector drivers now allow a BPF program to be loaded and
associated with the RX socket in the host kernel.

1. The program can be loaded as an extra kernel command line
option to any of the drivers.

2. The program can also be loaded as "firmware", using the
ethtool flash option. It is possible to turn this facility
on or off using a command line option.

A simplistic wrapper for generating the BPF firmware for the raw
socket driver out of a tcpdump/libpcap filter expression can be
found at: https://github.com/kot-begemot-uk/uml_vector_utilities/

Signed-off-by: Anton Ivanov 
---
 arch/um/drivers/vector_kern.c | 109 +++---
 arch/um/drivers/vector_kern.h |   8 ++-
 arch/um/drivers/vector_user.c |  94 +++--
 arch/um/drivers/vector_user.h |   8 ++-
 4 files changed, 190 insertions(+), 29 deletions(-)

diff --git a/arch/um/drivers/vector_kern.c b/arch/um/drivers/vector_kern.c
index af27d5c41776..7453b99ac1d2 100644
--- a/arch/um/drivers/vector_kern.c
+++ b/arch/um/drivers/vector_kern.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2017 - Cambridge Greys Limited
+ * Copyright (C) 2017 - 2019 Cambridge Greys Limited
  * Copyright (C) 2011 - 2014 Cisco Systems Inc
  * Copyright (C) 2001 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com)
  * Copyright (C) 2001 Lennert Buytenhek (buyt...@gnu.org) and
@@ -21,6 +21,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -128,6 +131,17 @@ static int get_mtu(struct arglist *def)
return ETH_MAX_PACKET;
 }
 
+static char *get_bpf_file(struct arglist *def)
+{
+   return uml_vector_fetch_arg(def, "bpffile");
+}
+
+static bool get_bpf_flash(struct arglist *def)
+{
+   return uml_vector_fetch_arg(def, "bpfflash") != NULL;
+}
+
+
 static int get_depth(struct arglist *def)
 {
char *mtu = uml_vector_fetch_arg(def, "depth");
@@ -176,6 +190,7 @@ static int get_transport_options(struct arglist *def)
int vec_rx = VECTOR_RX;
int vec_tx = VECTOR_TX;
long parsed;
+   int result = 0;
 
if (vector != NULL) {
if (kstrtoul(vector, 10, &parsed) == 0) {
@@ -186,14 +201,16 @@ static int get_transport_options(struct arglist *def)
}
}
 
+   if (get_bpf_flash(def))
+   result = VECTOR_BPF_FLASH;
 
if (strncmp(transport, TRANS_TAP, TRANS_TAP_LEN) == 0)
-   return 0;
+   return result;
if (strncmp(transport, TRANS_HYBRID, TRANS_HYBRID_LEN) == 0)
-   return (vec_rx | VECTOR_BPF);
+   return (result | vec_rx | VECTOR_BPF);
if (strncmp(transport, TRANS_RAW, TRANS_RAW_LEN) == 0)
-   return (vec_rx | vec_tx | VECTOR_QDISC_BYPASS);
-   return (vec_rx | vec_tx);
+   return (result | vec_rx | vec_tx | VECTOR_QDISC_BYPASS);
+   return (result | vec_rx | vec_tx);
 }
 
 
@@ -1125,6 +1142,7 @@ static int vector_net_close(struct net_device *dev)
netif_stop_queue(dev);
del_timer(&vp->tl);
 
+
if (vp->fds == NULL)
return 0;
 
@@ -1139,6 +1157,8 @@ static int vector_net_close(struct net_device *dev)
}
tasklet_kill(&vp->tx_poll);
if (vp->fds->rx_fd > 0) {
+   if (vp->bpf)
+   uml_vector_detach_bpf(vp->fds->rx_fd, vp->bpf);
os_close_file(vp->fds->rx_fd);
vp->fds->rx_fd = -1;
}
@@ -1146,7 +1166,10 @@ static int vector_net_close(struct net_device *dev)
os_close_file(vp->fds->tx_fd);
vp->fds->tx_fd = -1;
}
+   if (vp->bpf != NULL)
+   kfree(vp->bpf->filter);
kfree(vp->bpf);
+   vp->bpf = NULL;
kfree(vp->fds->remote_addr);
kfree(vp->transport_data);
kfree(vp->header_rxbuffer);
@@ -1196,6 +1219,8 @@ static int vector_net_open(struct net_device *dev)
vp->opened = true;
spin_unlock_irqrestore(&vp->lock, flags);
 
+   vp->bpf = uml_vector_user_bpf(get_bpf_file(vp->parsed));
+
vp->fds = uml_vector_user_open(vp->unit, vp->parsed);
 
if (vp->fds == NULL)
@@ -1267,8 +1292,11 @@ static int vector_net_open(struct net_device *dev)
if (!uml_raw_enable_qdisc_bypass(vp->fds->rx_fd))
vp->options |= VECTOR_BPF;
}
-   if ((vp->options & VECTOR_BPF) != 0)
-   vp->bpf = uml_vector_default_bpf(vp->fds->rx_fd, dev->dev_addr);
+   if (((vp->options & VECTOR_BPF) != 0) && (vp->bpf == NULL))
+   vp->bpf = uml_vector_default_bpf(dev->dev_addr);
+
+   if (vp->bpf != NULL)
+   uml_vector_attach_bpf(vp->fds->rx_fd, vp->bpf);
 
netif_start_queue(dev);
 
@@ -1347,6 +1375,67 @@ static v

Bug#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4

2019-09-20 Thread Anton Ivanov

Package: src:linux
Version: 5.2.9-2
Severity: critical
Justification: breaks unrelated software

Dear Maintainer,

NFSv4 caching is completely broken on SMP.

How to reproduce:

Option 1. Clone openwrt and run:

while make clean && make -j `nproc` ; do true ; done

It will break within several runs, depending on the number of CPUs.

Symptoms of breakage: a directory on the client looks empty. Example (mnt is an 
NFSv4 mount):

ls -laF 
/mnt/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 8
drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../

While it actually has a file in it (same on server):

ls -laF 
/exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
total 12
drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../
-rw-r--r-- 1 anivanov anivanov   32 Sep 20 10:51 ipcbuf.h

This cache entry on the client does not expire as it should per the NFSv4 
caching documentation - the only ways of dealing with it are a reboot, an 
unmount or dropping caches.

Option 2. Have your $HOME on nfsv4 and use thunderbird. Move mails between 
folders. Sooner or later (usually sooner) you will lose an email.

So this is both "breaks unrelated software" and "data loss" depending on what 
you are doing.

Tested on:

AMD Ryzen 5 2400G, AMD Ryzen 5 1600X, AMD Ryzen 5 1600, AMD A8-6500

Shows up on all. Fastest on the 6 core 12 thread ryzens, slowest on the AMD A8 
(takes up to 3 iterations of make there).

Brgds,

A.

-- Package-specific info:
** Version:
Linux version 5.2.0-2-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 
(Debian 8.3.0-21)) #1 SMP Debian 5.2.9-2 (2019-08-21)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-5.2.0-2-amd64 
root=UUID=8eb17efb-6574-42d0-885e-487b98364059 ro mitigations=off noht quiet

** Not tainted

** Kernel log:
[3.684402] input: HD-Audio Generic Front Mic as /devices/pci0000:00/0000:00:08.1/0000:09:00.3/sound/card0/input8
[3.684490] input: HD-Audio Generic Rear Mic as /devices/pci0000:00/0000:00:08.1/0000:09:00.3/sound/card0/input9
[3.684555] input: HD-Audio Generic Line as /devices/pci0000:00/0000:00:08.1/0000:09:00.3/sound/card0/input10
[3.685553] input: HD-Audio Generic Line Out as /devices/pci0000:00/0000:00:08.1/0000:09:00.3/sound/card0/input11
[3.685627] input: HD-Audio Generic Front Headphone as /devices/pci0000:00/0000:00:08.1/0000:09:00.3/sound/card0/input12
[3.806626] kvm: Nested Virtualization enabled
[3.806636] kvm: Nested Paging enabled
[3.806637] SVM: Virtual VMLOAD VMSAVE supported
[3.806637] SVM: Virtual GIF supported
[3.820371] MCE: In-kernel MCE decoding enabled.
[3.824533] EDAC amd64: Node 0: DRAM ECC disabled.
[3.824536] EDAC amd64: ECC disabled in the BIOS or no ECC capability, 
module will not load.
Either enable ECC checking or force module loading by setting 
'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[3.872569] pktcdvd: pktcdvd0: writer mapped to sr0
[3.900858] EDAC amd64: Node 0: DRAM ECC disabled.
[3.900860] EDAC amd64: ECC disabled in the BIOS or no ECC capability, 
module will not load.
Either enable ECC checking or force module loading by setting 
'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[3.948661] EDAC amd64: Node 0: DRAM ECC disabled.
[3.948662] EDAC amd64: ECC disabled in the BIOS or no ECC capability, 
module will not load.
Either enable ECC checking or force module loading by setting 
'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[3.996651] EDAC amd64: Node 0: DRAM ECC disabled.
[3.996652] EDAC amd64: ECC disabled in the BIOS or no ECC capability, 
module will not load.
Either enable ECC checking or force module loading by setting 
'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)
[4.002382] audit: type=1400 audit(1568973482.655:2): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="libreoffice-xpdfimport" 
pid=706 comm="apparmor_parser"
[4.002712] audit: type=1400 audit(1568973482.655:3): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="libreoffice-senddoc" 
pid=701 comm="apparmor_parser"
[4.005254] audit: type=1400 audit(1568973482.659:4): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="libreoffice-oopslash" 
pid=699 comm="apparmor_parser"
[4.007555] audit: type=1400 audit(1568973482.659:5): apparmor="STATUS" 
operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=702 
comm="apparmor_parser"
[4.007558] audit: type=1400 audit(1568973482.659:6):

Bug#938962: user-mode-linux needs update for new linux

2019-09-18 Thread Anton Ivanov


On 12/09/2019 15:42, Anton Ivanov wrote:



On 12/09/2019 13:14, Ritesh Raj Sarraf wrote:

Hi,

I am not sure if this has been reported upstream but with libpcap 1.9,
user mode linux fails to build. The build failure happens with both,
5.2 and 4.19 LTS kernels.

A much more detailed report is available at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938962

libpcap 1.9 introduces `pcap_open` which is also declared in linux
headers in arch/um/drivers/pcap_user.c





I think the best way forward here is to kill the old libpcap driver 
altogether.


You get the same functionality from vector raw including the ability to 
load a bpf filter.


The only thing that is needed is a wrapper to compile the filter before 
handing it to UML.


A side effect is that it is ~10+ times faster - in the multigigabit range.

Alternatively, I can wrap it so it looks like pcap to any existing 
scripts and is actually vector underneath, but that will lose some of 
the tunables, like offloads, vector depth, etc.




Thanks,
Ritesh

On Sat, 2019-09-07 at 17:18 +0200, Romain Francoise wrote:

Hi,

On Tue, Sep 3, 2019 at 3:21 PM Ritesh Raj Sarraf 
wrote:
[...]

In file included from /usr/include/pcap.h:43,
  from arch/um/drivers/pcap_user.c:7:
/usr/include/pcap/pcap.h:835:18: note: previous declaration of
‘pcap_open’ was here
  PCAP_API pcap_t *pcap_open(const char *source, int snaplen, int
flags,
   ^
make[2]: *** [scripts/Makefile.build:309:
arch/um/drivers/pcap_user.o] Error 1


libpcap 1.9 includes support for remote capture, which was originally
a part of WinPcap extensions. The `pcap_open()' symbol is part of
that
API and that's why it's defined in the header file even though remote
support is not enabled in Debian. I suggest you rename the function
defined in your program so that it doesn't conflict with libpcap.

___
linux-um mailing list
linux...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um




I am going to try to write a wrapper to form arguments for the current
vector raw driver and see if there is anything that needs to be fixed in it.


I will post it as a proposed patch against the Debian package once it is ready.

Brgds,

A

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#938962: user-mode-linux needs update for new linux

2019-09-12 Thread Anton Ivanov





On 12/09/2019 13:14, Ritesh Raj Sarraf wrote:

Hi,

I am not sure if this has been reported upstream but with libpcap 1.9,
user mode linux fails to build. The build failure happens with both,
5.2 and 4.19 LTS kernels.

A much more detailed report is available at:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938962

libpcap 1.9 introduces `pcap_open` which is also declared in linux
headers in arch/um/drivers/pcap_user.c





I think the best way forward here is to kill the old libpcap driver 
altogether.


You get the same functionality from vector raw including the ability to 
load a bpf filter.


The only thing it needs is a wrapper to compile the filter before
handing it to UML.


A side effect is that it is roughly 10 or more times faster - in the multigigabit range.

Alternatively, I can wrap it so it looks like pcap to any existing 
scripts and is actually vector underneath, but that will lose some of 
the tunables, like offloads, vector depth, etc.




Thanks,
Ritesh

On Sat, 2019-09-07 at 17:18 +0200, Romain Francoise wrote:

Hi,

On Tue, Sep 3, 2019 at 3:21 PM Ritesh Raj Sarraf 
wrote:
[...]

In file included from /usr/include/pcap.h:43,
  from arch/um/drivers/pcap_user.c:7:
/usr/include/pcap/pcap.h:835:18: note: previous declaration of
‘pcap_open’ was here
  PCAP_API pcap_t *pcap_open(const char *source, int snaplen, int
flags,
   ^
make[2]: *** [scripts/Makefile.build:309:
arch/um/drivers/pcap_user.o] Error 1


libpcap 1.9 includes support for remote capture, which was originally
a part of WinPcap extensions. The `pcap_open()' symbol is part of
that
API and that's why it's defined in the header file even though remote
support is not enabled in Debian. I suggest you rename the function
defined in your program so that it doesn't conflict with libpcap.



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#939877: closed by Josue Ortega (Bug#939877: fixed in rpcbind 1.2.5-7)

2019-09-10 Thread Anton Ivanov


I missed the actual receive line in 1.2.5-7, apologies.

It alone DOES NOT fix it, though.

There is breakage in libwrap to accompany it.

Once the fix in 1.2.5-7 is in, rpcbind starts receiving messages
(according to strace), and each one is followed by it interrogating
addresses and interfaces via netlink.


As I do not see any netlink references anywhere in rpcbind or
libtirpc-dev, I believe this is libwrap, which now has a broken broadcast
check. So anything compiled with libwrap which needs to receive broadcasts
needs to be set as ALL:ALL in hosts.allow - otherwise it is dropped.


Upgrading to both 1.2.5-7 _AND_ setting hosts.allow to ALL:ALL provides 
a viable workaround.


The remaining part of this bug is in libwrap; you can refile it against that.

Best Regards,

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#939877: closed by Josue Ortega (Bug#939877: fixed in rpcbind 1.2.5-7)

2019-09-10 Thread Anton Ivanov


That's not it.

Same story with 1.2.5-7 from unstable.

This is on the NIS server, after restarting NIS on the client:

root@jain:# tcpdump -nvvv -i enp7s0f1.502 udp and port 111

tcpdump: listening on enp7s0f1.502, link-type EN10MB (Ethernet), capture 
size 262144 bytes


    192.168.20.41.36268 > 192.168.20.63.111: [udp sum ok] UDP, length 92
09:02:57.820457 IP (tos 0x0, ttl 64, id 55627, offset 0, flags [DF], 
proto UDP (17), length 120)

    192.168.20.41.36268 > 192.168.20.63.111: [udp sum ok] UDP, length 92
09:03:03.826888 IP (tos 0x0, ttl 64, id 55969, offset 0, flags [DF], 
proto UDP (17), length 120)


And so on - the RPC retransmits go to the broadcast address (.63 on this
subnet, as it is a /26).
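
As a sanity check of the addressing above, here is a minimal sketch using Python's standard ipaddress module. The network is inferred from the client address in the capture and the /26 prefix; nothing else about the setup is assumed.

```python
import ipaddress

# Client address taken from the capture, combined with the /26 prefix
# mentioned in the text.
client = ipaddress.ip_interface("192.168.20.41/26")

network = client.network
broadcast = network.broadcast_address

print(network)    # 192.168.20.0/26
print(broadcast)  # 192.168.20.63
```

So the destination 192.168.20.63 seen in the capture is indeed the directed broadcast address of the client's subnet.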


Traffic goes only one way; strace on rpcbind shows only netlink messages,
no UDP recv.


Same thing after setting a NIS server address on the client and
restarting NIS - immediate response.


tcpdump -nvvv -i enp7s0f1.502 udp and port 111

   192.168.20.41.800 > 192.168.3.3.111: [udp sum ok] UDP, length 56
09:05:00.429940 IP (tos 0x0, ttl 64, id 22755, offset 0, flags [DF], 
proto UDP (17), length 56)
    192.168.3.3.111 > 192.168.20.41.800: [bad udp cksum 0x98b2 -> 
0x1245!] UDP,


strace of the rpcbind process

sendmsg(6, {msg_name={sa_family=AF_INET, sin_port=htons(800), 
sin_addr=inet_addr("192.168.20.41")}, msg_namelen=16, 
msg_iov=[{iov_base=".{\272q\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\265", 
iov_len=28}], msg_iovlen=1, msg_control=[{cmsg_len=28, 
cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=0, 
ipi_spec_dst=inet_addr("192.168.3.3"), 
ipi_addr=inet_addr("192.168.3.3")}}], msg_controllen=32, msg_flags=0}, 
0) = 28


That line (strace) never occurs in the broadcast case.

It simply is not listening to broadcast queries.

I will try to wade through the source to see exactly how it manages it, 
because listening on INADDR_ANY should in theory get you broadcasts.
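
For reference, a minimal sketch of the kind of listener meant here: a UDP socket bound to INADDR_ANY. This is not rpcbind's code; the helper name is hypothetical, and port 0 is used so the sketch runs unprivileged (rpcbind itself listens on port 111).

```python
import socket

def open_any_listener(port=0):
    """Bind a UDP socket to INADDR_ANY; port 0 lets the kernel pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", port))  # 0.0.0.0 is INADDR_ANY
    return sock

listener = open_any_listener()
bound_addr, bound_port = listener.getsockname()
print(bound_addr, bound_port)
listener.close()
```

A socket bound this way is not tied to one local address, which is why broadcast datagrams for the port would be expected to reach it.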



On 09/09/2019 22:00, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report
which was filed against the rpcbind package:

#939877: rpcbind: Does not receive any broadcast queries resulting in complete 
breakage of NIS

It has been closed by Josue Ortega.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Josue Ortega by
replying to this email.



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#939877: rpcbind: Does not receive any broadcast queries resulting in complete breakage of NIS

2019-09-09 Thread Anton Ivanov

Package: rpcbind
Version: 1.2.5-0.3
Severity: grave
Justification: renders package unusable

Dear Maintainer,

After an upgrade to buster rpcbind no longer receives any broadcast queries.
Unicast works.

This is verified via strace - it sees occasional netlink messages, but none of
the broadcast traffic to port 111 ever reaches it.

As a result clients can no longer find a NIS server which has been upgraded to
buster.

-- System Information:
Debian Release: 10.1
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-6-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages rpcbind depends on:
ii  adduser  3.118
ii  libc62.28-10
ii  libsystemd0  241-7~deb10u1
ii  libtirpc31.1.4-0.4
ii  libwrap0 7.6.q-28
ii  lsb-base 10.2019051400

rpcbind recommends no packages.

rpcbind suggests no packages.

-- no debconf information

Bug#926305: closed by Elimar Riesebieter (Re: Bug#926305: nis startup scripts are completely broken)

2019-04-18 Thread Anton Ivanov


Please reopen.

Advice is no replacement for a Depends in the package control file.

As shipped, the package is still broken, and at the reported severity -
it breaks most of the system.


A.

On 18/04/2019 14:48, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report
which was filed against the nis package:

#926305: nis startup scripts are completely broken

It has been closed by Elimar Riesebieter.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Elimar Riesebieter by
replying to this email.



--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#926305: nis startup scripts are completely broken

2019-04-18 Thread Anton Ivanov


That is not advice.

If nscd is a required dependency, NIS should bring it in.

Presently it does not.

Still broken.

A.

On 18/04/2019 14:43, Elimar Riesebieter wrote:

* Elimar Riesebieter  [2019-04-03 11:06 +0200]:


* Anton Ivanov  [2019-04-03 09:43 +0100]:


Package: nis
Version: 3.17.1-3+b1
Severity: critical
Justification: breaks unrelated software

Dear Maintainer,

Startup scripts are completely broken. Something in the systemd
conversion/autogeneration.

The ypbind binary is never started, the script goes into "backgrounded" and
fails. From there on the system is unusable - you cannot log in, UIDs and
groups do not resolve, etc.

The same system operated correctly before the buster upgrade and will operate
correctly if ypbind is invoked from the command line.

This looks like a pure systemd conversion issue of some sort.

On my systems installing nscd helped. Setting "YPBINDARGS=" in
/etc/default/nis is also a must.

This bug should be closed as there is no response from the reporter.
It also seems to be fixed by following the advice given above.

Elimar


--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Bug#926305: nis startup scripts are completely broken

2019-04-03 Thread Anton Ivanov

Package: nis
Version: 3.17.1-3+b1
Severity: critical
Justification: breaks unrelated software

Dear Maintainer,

Startup scripts are completely broken. Something in the systemd
conversion/autogeneration.

The ypbind binary is never started, the script goes into "backgrounded" and
fails. From there on the system is unusable - you cannot log in, UIDs and
groups do not resolve, etc.

The same system operated correctly before the buster upgrade and will operate
correctly if ypbind is invoked from the command line.

This looks like a pure systemd conversion issue of some sort.

-- Package-specific info:

NIS domain: home 

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-4-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages nis depends on:
ii  debconf [debconf-2.0]  1.5.71
ii  hostname   3.21
ii  libc6  2.28-8
ii  libgdbm6   1.18.1-4
ii  libsystemd0241-1
ii  lsb-base   10.2019031300
ii  make   4.2.1-1.2
ii  netbase5.6
ii  rpcbind [portmap]  1.2.5-0.3

nis recommends no packages.

Versions of packages nis suggests:
pn  nscd  

-- Configuration Files:
/etc/yp.conf changed [not included]

-- debconf information:
* nis/domain: home

Bug#878046: amanda-server: Fails all backups if one or more hosts are down

2017-10-22 Thread Anton Ivanov

I am OK to wait for the upload 

On 22 October 2017 13:26:56 EEST, Jose M Calhariz <j...@calhariz.com> wrote:
>That is an old problem of amanda that is solved on v3.5.  But the error
>messages are usually different from what you see.
>
>I have been working on a new package that I should upload very shortly,
>to sid and backports.  If you are dead in the water I
>can provide my work-in-progress packages for stretch on amd64.
>
>Kind regards
>Jose M Calhariz
>
>On 09/10/17 06:55, Anton Ivanov wrote:
>> Package: amanda-server
>> Version: 1:3.3.9-5
>> Severity: grave
>> Justification: renders package unusable
>>
>> Dear Maintainer,
>>
>> If one or more backup hosts are unreachable, the backup of all hosts
>fails.
>>
>> Example - backing up two hosts - smaug and TerribleTerror1:
>>
>> If the latter is unreachable
>>
>>   TerribleTerror1 /etc lev 0  FAILED [Request to TerribleTerror1
>failed: Connection timed out]
>>
>> The former (and all other hosts in the backup sequence) fail with:
>>
>>   smaug /exports/md0/home/aivanov lev 0  FAILED [Request to smaug
>failed: error sending REQ: write error to: Broken pipe]
>>
>> -- System Information:
>> Debian Release: 9.0
>>   APT prefers stable
>>   APT policy: (500, 'stable')
>> Architecture: amd64 (x86_64)
>>
>> Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores)
>> Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8),
>LANGUAGE=en_GB:en (charmap=UTF-8)
>> Shell: /bin/sh linked to /bin/dash
>> Init: systemd (via /run/systemd/system)
>>
>> Versions of packages amanda-server depends on:
>> ii  amanda-common  1:3.3.9-5
>> ii  bsd-mailx [mailx]  8.1.2-0.20160123cvs-4
>> ii  libc6  2.24-11+deb9u1
>> ii  libcurl3   7.52.1-5
>> ii  libglib2.0-0   2.50.3-2
>> ii  libssl1.1  1.1.0f-3
>> ii  perl   5.24.1-3
>>
>> amanda-server recommends no packages.
>>
>> Versions of packages amanda-server suggests:
>> ii  amanda-client  1:3.3.9-5
>> ii  cpio   2.11+dfsg-6
>> ii  gnuplot5.0.5+dfsg1-6
>> ii  mt-st  1.3-1
>>
>> -- no debconf information

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Bug#878046: amanda-server: Fails all backups if one or more hosts are down

2017-10-09 Thread Anton Ivanov

Package: amanda-server
Version: 1:3.3.9-5
Severity: grave
Justification: renders package unusable

Dear Maintainer,

If one or more backup hosts are unreachable, the backup of all hosts fails.

Example - backing up two hosts - smaug and TerribleTerror1:

If the latter is unreachable

  TerribleTerror1 /etc lev 0  FAILED [Request to TerribleTerror1 failed: 
Connection timed out]

The former (and all other hosts in the backup sequence) fail with:

  smaug /exports/md0/home/aivanov lev 0  FAILED [Request to smaug failed: error 
sending REQ: write error to: Broken pipe]

-- System Information:
Debian Release: 9.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8), LANGUAGE=en_GB:en 
(charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages amanda-server depends on:
ii  amanda-common  1:3.3.9-5
ii  bsd-mailx [mailx]  8.1.2-0.20160123cvs-4
ii  libc6  2.24-11+deb9u1
ii  libcurl3   7.52.1-5
ii  libglib2.0-0   2.50.3-2
ii  libssl1.1  1.1.0f-3
ii  perl   5.24.1-3

amanda-server recommends no packages.

Versions of packages amanda-server suggests:
ii  amanda-client  1:3.3.9-5
ii  cpio   2.11+dfsg-6
ii  gnuplot5.0.5+dfsg1-6
ii  mt-st  1.3-1

-- no debconf information

Bug#844584: dhclient should perform additional validity checks

2016-11-17 Thread Anton Ivanov

Package: isc-dhcp-client
Version: 4.3.1-6+deb8u2
Severity: serious
File: /sbin/dhclient
Tags: security

https://samy.pl/poisontap/

This is a variation on an ancient "gem" by a DSL modem vendor
where the router pretends to be the entire internet by spoofing
ARP so that it captures all traffic.

The best way to deal with this is to set an upper limit on the
size of the acceptable network (that is, a minimum netmask length)
in /etc/default/isc-dhcp-client and verify it in a hook (which can
be Debian specific).

This way a DHCP reply of 0.0.0.0/0 or anything larger than a class
A will raise a security alert instead of blindly exposing the
machine to a spoofing attack.
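
The check itself is small. Below is a minimal sketch of it in Python, only to illustrate the logic; a real implementation would live in a dhclient hook script. The name MIN_PREFIX_LEN and the function lease_is_acceptable are hypothetical, and the limit of /8 is an assumption taken from the "class A" wording above.

```python
import ipaddress

# Hypothetical policy knob: the shortest prefix (largest network) we accept.
# A class A network is a /8, so anything shorter than /8 is rejected.
MIN_PREFIX_LEN = 8

def lease_is_acceptable(address, netmask, min_prefix_len=MIN_PREFIX_LEN):
    """Return True if the offered network is no larger than the configured limit."""
    # strict=False because the offered address has host bits set.
    network = ipaddress.ip_network(f"{address}/{netmask}", strict=False)
    return network.prefixlen >= min_prefix_len

# A lease claiming the whole IPv4 space is rejected; ordinary ones pass.
print(lease_is_acceptable("1.0.0.10", "0.0.0.0"))           # False
print(lease_is_acceptable("192.168.1.5", "255.255.255.0"))  # True
```

A hook built on this would refuse to configure the interface and log an alert when the function returns False.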


-- System Information:
Debian Release: 8.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages isc-dhcp-client depends on:
ii  debianutils   4.4+b1
ii  iproute2  3.16.0-2
ii  isc-dhcp-common   4.3.1-6+deb8u2
ii  libc6 2.19-18+deb8u6
ii  libdns-export100  1:9.9.5.dfsg-9+deb8u7
ii  libirs-export91   1:9.9.5.dfsg-9+deb8u7
ii  libisc-export95   1:9.9.5.dfsg-9+deb8u7

isc-dhcp-client recommends no packages.

Versions of packages isc-dhcp-client suggests:
pn  avahi-autoipd  
pn  resolvconf 

-- no debconf information

Bug#798178: warzone2100: Major regressions compared to squeeze

2015-09-06 Thread Anton Ivanov

Package: warzone2100
Version: 3.1.1-1
Severity: grave
Justification: renders package unusable

Dear Maintainer,

The new version is unplayable. 

1. Units produced during a remote mission, instead of being delivered to
the delivery area of the factory producing them, are locked in a
rock somewhere off-map. This renders them unusable and also renders most
preparation strategies unusable. This is an obvious bug and it worked
correctly in the squeeze version.

2. The "nudge your neighbour" part of the unit obstacle avoidance algorithm
does not work. The result is unit deadlock, because units that need to
"give way" for you to get past them just sit and wait unless moved
manually.

This can be resolved only by picking every unit separately, and moving
them "by hand". They definitely cannot be moved in a formation any more. 
This renders commanders, sensors, etc mostly unusable. You now cannot
retreat a group under command because the "subordinates" will not move 
out of the way for the commander to pass. They will also not move out 
of the way for any damaged units to go for repair. 

Again - this worked in squeeze.

Frankly, can we have the squeeze version recompiled and released as an
"update"? This "improvement" is unplayable.

I am definitely recompiling it locally from squeeze sources as the main
users (the kids) are revolting because this is unusable.

-- System Information:
Debian Release: 8.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages warzone2100 depends on:
ii  libc6 2.19-18
ii  libfontconfig12.11.0-6.3
ii  libfreetype6  2.5.2-3
ii  libfribidi0   0.19.6-3
ii  libgcc1   1:4.9.2-10
ii  libgl1-mesa-glx [libgl1]  10.3.2-1
ii  libglc0   0.7.2-5+b2
ii  libglew1.10   1.10.0-3
ii  libglu1-mesa [libglu1]9.0.0-2
ii  libminiupnpc101.9.20140610-2
ii  libogg0   1.3.2-1
ii  libopenal11:1.15.1-5
ii  libphysfs12.0.3-2
ii  libpng12-01.2.50-2+b2
ii  libqt4-network4:4.8.6+git64-g5dc8b2b+dfsg-3+deb8u1
ii  libqt4-script 4:4.8.6+git64-g5dc8b2b+dfsg-3+deb8u1
ii  libqtcore44:4.8.6+git64-g5dc8b2b+dfsg-3+deb8u1
ii  libsdl1.2debian   1.2.15-10+b1
ii  libstdc++64.9.2-10
ii  libtheora01.1.1+dfsg.1-6
ii  libvorbis0a   1.3.4-2
ii  libvorbisfile31.3.4-2
ii  libx11-6  2:1.6.2-3
ii  libxrandr22:1.4.2-1+b1
ii  warzone2100-data  3.1.1-1
ii  zlib1g1:1.2.8.dfsg-2+b1

Versions of packages warzone2100 recommends:
ii  warzone2100-music  3.1.1-1

warzone2100 suggests no packages.

-- no debconf information

Bug#741075: user-mode-linux: Occasional memory corruption on startup under high load

2014-08-29 Thread Anton Ivanov


On 09/03/14 21:35, Mattia Dongili wrote:

On Sat, Mar 08, 2014 at 07:04:56AM +, Anton Ivanov wrote:

Package: user-mode-linux
Version: 3.2-2um-1+deb7u2+b1
Severity: grave
Tags: patch
Justification: causes non-serious data loss

Dear Maintainer,

This bug is perennial. If we go through old bugs with the
"cannot reproduce" tag, 50% of them are this one; the other
50% are the "you should not use a pipe for interprocess IPC" bug,
which we will submit shortly.

Manifestation of the problem - UML dies on startup for no
reason with a memory corruption message. Occurs only on
heavily loaded systems and usually when running a lot of
UMLs.

Thanks for the patch.
I have noticed that you submitted this patch set (together with the
other two you sent here and more) upstream and they will be in the
stable branch.
The easiest path here is also to go through the stable release of
linux-source where uml is built from. I'll keep an eye on the stable
tree but it'd be very helpful if you could add the stable tree commit
ids once the patches get included. Same story for the other two bugs.


All 3 bugs now have patches submitted upstream. I have submitted our
other improvements as well.


While they do not make a speed demon of UML userspace, they get it
reasonably close to kvm. The kernel itself is now faster than qemu-kvm for
most networking stuff.


A.



Thanks!



--
To UNSUBSCRIBE, email to debian-bugs-rc-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org

Bug#741075: user-mode-linux: Occasional memory corruption on startup under high load

2014-03-09 Thread Anton Ivanov

On 09/03/14 21:35, Mattia Dongili wrote:
 On Sat, Mar 08, 2014 at 07:04:56AM +, Anton Ivanov wrote:
 Package: user-mode-linux
 Version: 3.2-2um-1+deb7u2+b1
 Severity: grave
 Tags: patch
 Justification: causes non-serious data loss

 Dear Maintainer,

 This bug is perennial. If we go through old bugs with the
 "cannot reproduce" tag, 50% of them are this one; the other
 50% are the "you should not use a pipe for interprocess IPC" bug,
 which we will submit shortly.

 Manifestation of the problem - UML dies on startup for no
 reason with a memory corruption message. Occurs only on 
 heavily loaded systems and usually when running a lot of 
 UMLs.
 Thanks for the patch.
 I have noticed that you submitted this patch set (together with the
 other two you sent here and more) upstream and they will be in the
 stable branch.
 The easiest path here is also to go through the stable release of
 linux-source where uml is built from. I'll keep an eye on the stable
 tree but it'd be very helpful if you could add the stable tree commit
 ids once the patches get included. Same story for the other two bugs.

 Thanks!

You are welcome.

I will update them once Richard Weinberger gets around to merging them
(hopefully soon).

-- 
If you think it's expensive to hire a professional to do the job,
wait until you hire an amateur.
Paul Neal Red Adair 

A. R. Ivanov
E-mail:  anton.iva...@kot-begemot.co.uk



Bug#741075: user-mode-linux: Occasional memory corruption on startup under high load

2014-03-07 Thread Anton Ivanov

Package: user-mode-linux
Version: 3.2-2um-1+deb7u2+b1
Severity: grave
Tags: patch
Justification: causes non-serious data loss

Dear Maintainer,

This bug is perennial. If we go through old bugs with the
"cannot reproduce" tag, 50% of them are this one; the other
50% are the "you should not use a pipe for interprocess IPC" bug,
which we will submit shortly.

Manifestation of the problem - UML dies on startup for no
reason with a memory corruption message. Occurs only on 
heavily loaded systems and usually when running a lot of 
UMLs.

-- System Information:
Debian Release: 7.3
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-4-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages user-mode-linux depends on:
ii  libc6  2.13-38
ii  uml-utilities  20070815-1.1-ai-1.8

user-mode-linux recommends no packages.

Versions of packages user-mode-linux suggests:
ii  gnome-terminal [x-terminal-emulator]  3.4.1.1-2
ii  konsole [x-terminal-emulator] 4:4.8.4-2
pn  rootstrap none
pn  slirp none
pn  user-mode-linux-doc   none
pn  vde2  none
ii  xfce4-terminal [x-terminal-emulator]  0.4.8-1+b1
ii  xterm [x-terminal-emulator]   278-4

-- no debconf information
From 9c3a9af21c0bfeca27eac958fde215594b4ee3fa Mon Sep 17 00:00:00 2001
From: Anton Ivanov antiv...@cisco.com
Date: Sat, 8 Mar 2014 06:49:27 +
Subject: [PATCH 2/3] BUG: Memory corruption on startup

The reverse case of this race (you must msync before read) is
well known. This is the not so common one.

It can be triggered only on systems which do a lot of task
switching and only at UML startup. If you are starting 200+ UMLs
~ 0.5% will always die without this fix.
---
 arch/um/include/shared/os.h |1 +
 arch/um/kernel/physmem.c|1 +
 arch/um/os-Linux/file.c |6 ++
 3 files changed, 8 insertions(+)

diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h
index 89b686c1..3c9738d 100644
--- a/arch/um/include/shared/os.h
+++ b/arch/um/include/shared/os.h
@@ -136,6 +136,7 @@ extern int os_ioctl_generic(int fd, unsigned int cmd, unsigned long arg);
 extern int os_get_ifname(int fd, char *namebuf);
 extern int os_set_slip(int fd);
 extern int os_mode_fd(int fd, int mode);
+extern int os_fsync_file(int fd);
 
 extern int os_seek_file(int fd, unsigned long long offset);
 extern int os_open_file(const char *file, struct openflags flags, int mode);
diff --git a/arch/um/kernel/physmem.c b/arch/um/kernel/physmem.c
index f116db1..30fdd5d0 100644
--- a/arch/um/kernel/physmem.c
+++ b/arch/um/kernel/physmem.c
@@ -103,6 +103,7 @@ void __init setup_physmem(unsigned long start, unsigned long reserve_end,
 	 */
 	os_seek_file(physmem_fd, __pa(__syscall_stub_start));
 	os_write_file(physmem_fd, __syscall_stub_start, PAGE_SIZE);
+	os_fsync_file(physmem_fd);
 
 	bootmap_size = init_bootmem(pfn, pfn + delta);
 	free_bootmem(__pa(reserve_end) + bootmap_size,
diff --git a/arch/um/os-Linux/file.c b/arch/um/os-Linux/file.c
index b049a63..a4f0e65 100644
--- a/arch/um/os-Linux/file.c
+++ b/arch/um/os-Linux/file.c
@@ -237,6 +237,12 @@ void os_close_file(int fd)
 {
 	close(fd);
 }
+int os_fsync_file(int fd)
+{
+	if (fsync(fd) < 0)
+		return -errno;
+	return 0;
+}
 
 int os_seek_file(int fd, unsigned long long offset)
 {
-- 
1.7.10.4
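
The patch enforces a write, fsync, then use ordering on the physmem file. A minimal userspace sketch of that ordering follows, assuming a temporary file stands in for the physmem file. The function name and payload are illustrative only, and the sketch does not reproduce the race itself.

```python
import mmap
import os
import tempfile

PAGE_SIZE = mmap.PAGESIZE

def write_page_then_map(payload):
    """Write one page to a temp file, fsync it, then read it back via mmap."""
    fd, path = tempfile.mkstemp()
    try:
        # Pad the payload to a full page, like the PAGE_SIZE write in the patch.
        os.write(fd, payload.ljust(PAGE_SIZE, b"\0"))
        os.fsync(fd)  # plays the role of os_fsync_file() in the patch
        with mmap.mmap(fd, PAGE_SIZE, access=mmap.ACCESS_READ) as mapping:
            return bytes(mapping[:len(payload)])
    finally:
        os.close(fd)
        os.unlink(path)

print(write_page_then_map(b"stub"))
```

The point is only the order of operations: the flush happens before anything relies on a mapping of the same file.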

Bug#622652: alsa-driver: fails to build on powermac

2011-11-11 Thread Anton Ivanov


On 10/11/11 09:32, Anton Ivanov wrote:

On 10/11/11 08:49, Jonathan Nieder wrote:

Anton Ivanov wrote:

I can try to patch it to build some time next week. However, looking
at the supported kernels file in the package it may be better to go
straight for 1.0.24 which is current alsa stable.

Any news on that?  (No problem if the answer is no. :))  By the way,
for reference, what kernel were you building against?



Apologies, I have been overloaded with other stuff for the last few
months. I have 3 nearly free weeks until December and will get around to
looking at this and other Mac-specific bugs (I have a few more filed
against X, etc).


It fails to build because the alsa driver source expects
pdev_archdata to contain the same information as dev_archdata, which
on ppc includes references to the OpenFirmware tree.


Well, in 2.6.32 as shipped in squeeze that structure is blank. So, rather
unsurprisingly, it fails to build. I am going to pull .38 to see how this
structure evolved over time.


Brgds,




Bug#622652: alsa-driver: fails to build on powermac

2011-11-10 Thread Anton Ivanov


On 10/11/11 08:49, Jonathan Nieder wrote:

Anton Ivanov wrote:

   

I can try to patch it to build some time next week. However, looking at the
supported kernels file in the package it may be better to go straight for
1.0.24 which is current alsa stable.
 

Any news on that?  (No problem if the answer is no. :))  By the way,
for reference, what kernel were you building against?

   


Apologies, I have been overloaded with other stuff for the last few
months. I have 3 nearly free weeks until December and will get around to
looking at this and other Mac-specific bugs (I have a few more filed
against X, etc).


--
If you think it's expensive to hire a professional to do the job,
wait until you hire an amateur.
Paul Neal Red Adair

A. R. Ivanov
E-mail:  anton.iva...@kot-begemot.co.uk





Bug#622652: [Pkg-alsa-devel] Bug#622652: alsa-driver: fails to build on powermac

2011-04-14 Thread Anton Ivanov


On 04/14/11 18:27, Elimar Riesebieter wrote:

* Anton Ivanov [110413 17:57 +0100]:
   

Package: alsa-driver
 

Which version?

   

Severity: serious
Justification: fails to build from source (but built successfully in the past)
 

Elimar

   

Standard squeeze one.

 1.0.23+dfsg-2

This is a ppc only problem. Unless I am mistaken, that part of the build 
is not invoked on other platforms.


I can try to patch it to build some time next week. However, looking at 
the supported kernels file in the package it may be better to go 
straight for 1.0.24 which is current alsa stable.


Brgds,

--
   Understanding is a three-edged sword:
your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail:  aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ai...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331  89D5 FCDA 572E DDE5 E715







Bug#622652: Acknowledgement (alsa-driver: fails to build on powermac)

2011-04-14 Thread Anton Ivanov


Some digging points to this:

http://permalink.gmane.org/gmane.linux.kernel.commits.head/226657

as a likely culprit.

Brgds,




Bug#622652: alsa-driver: fails to build on powermac

2011-04-13 Thread Anton Ivanov

Package: alsa-driver
Severity: serious
Justification: fails to build from source (but built successfully in the past)


I am still getting from time to time (not always) sound glitches similar
to the ones reported in: Bug#610859 so I decided to try building more 
recent alsa from source. However it does not build.

In file included from /usr/src/modules/alsa-driver/ppc/pmac.c:13:   
/usr/src/modules/alsa-driver/ppc/../alsa-kernel/ppc/pmac.c: In function 
‘detect_byte_swap’: 
/usr/src/modules/alsa-driver/ppc/../alsa-kernel/ppc/pmac.c:925: error:  
implicit declaration of function ‘of_machine_is_compatible’ 
make[7]: *** [/usr/src/modules/alsa-driver/ppc/pmac.o] Error 1  
make[6]: *** [/usr/src/modules/alsa-driver/ppc] Error 2 
make[5]: *** [_module_/usr/src/modules/alsa-driver] Error 2 
make[4]: *** [sub-make] Error 2 
make[3]: *** [all] Error 2  
make[3]: Leaving directory `/usr/src/linux-headers-2.6.32-5-powerpc'
make[2]: *** [compile] Error 2  
make[2]: Leaving directory `/usr/src/modules/alsa-driver'   
make[1]: *** [build-stamp] Error 2  
make[1]: Leaving directory `/usr/src/modules/alsa-driver'   

This is the tail of m-a a-i alsa

-- System Information:
Debian Release: 6.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: powerpc (ppc)

Kernel: Linux 2.6.32-5-powerpc
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash




Bug#520095: removes the toplevel mountpoint directories and fails to start the next time

2009-08-02 Thread Anton Ivanov

On Sun, 2009-08-02 at 01:40 +0200, Jan Christoph Nordholz wrote:
 Hi Michael,
 
  No idea.  If I knew, I'd attach a patch for this issue.
  The code is quite.. funny and fragile, I tried to understand
  it right before submitting a bugreport but that wasn't quite
  successful.
 
  I ran it under strace - pure automountd, without any startup
  scripts but with the same args.  It never ever tried to mkdir
  or rename.  It created two random dirs in /tmp, mounted a
  tmpfs over one of them (running mount(8)), bind-mounted it
  on second dir, next did stat(/misc) (which returned ENOENT)
  and immediately gave up returning it can't mount /misc.
 
 this is the strace log on my system after the spawned umount
 process has terminated:
 
 ] 30451 --- SIGCHLD (Child exited) @ 0 (0) ---
 ] 30451 rmdir(/tmp/autoa1Aqlv)  = 0
 ] 30451 rmdir(/tmp/autohY6Rkm)  = 0
 ] 30451 rt_sigaction(SIGTERM, {0xb801dd70, [HUP USR1 USR2 ALRM TERM], 
 SA_RESTART}, NULL, 8) = 0
 ] 30451 rt_sigaction( several more )
 ] 30451 open(/etc/mtab, O_RDONLY)   = 8
 ] 30451 fstat64(8, {st_mode=S_IFREG|0644, st_size=701, ...}) = 0
 ] 30451 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
 -1, 0) = 0xb7fe7000
 ] 30451 read(8, /dev/sda2 / ext3 rw,errors=remou..., 4096) = 701
 ] 30451 read(8, , 4096) = 0
 ] 30451 close(8)  = 0
 ] 30451 munmap(0xb7fe7000, 4096)  = 0
 ] 30451 stat64(/misc, 0xbfbca894)   = -1 ENOENT (No such file or 
 directory)
 ] 30451 open(/etc/mtab, O_RDONLY)   = 8
 ] 30451 fstat64(8, {st_mode=S_IFREG|0644, st_size=701, ...}) = 0
 ] 30451 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
 -1, 0) = 0xb7fe7000
 ] 30451 read(8, /dev/sda2 / ext3 rw,errors=remou..., 4096) = 701
 ] 30451 read(8, , 4096) = 0
 ] 30451 close(8)  = 0
 ] 30451 munmap(0xb7fe7000, 4096)  = 0
 ] 30451 statfs(/, {f_type=EXT2_SUPER_MAGIC, f_bsize=4096, 
 f_blocks=9612195, f_bfree=5459645, f_bavail=4971364, f_files=2444624, 
 f_ffree=2142
 ] 30451 mkdir(/misc, 0555)  = 0
 ] 30451 pipe([8, 11]) = 0
 ] 30451 pipe([12, 13])= 0
 ] 30451 rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
 ] 30451 pipe([14, 15])= 0
 ] 30451 clone(child_stack=0, 
 flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
 child_tidptr=0xb7e7fb48) = 30454
 ] [...] which calls:
 ] 30454 execve(/bin/mount, [/bin/mount, -t, autofs, -o, 
 fd=11,pgrp=30451,minproto=2,maxp..., automount(pid30451), /misc], [/* 
 44 vars */]) = 0
 
 Maybe you can spot the difference that's causing your automountd to
 give up - but I'd suggest switching to v5 anyway because upstream
 development on v4 has ceased, and I'd like to drop v4 before Squeeze
 is released.

Can I propose a simple workaround until v5 is out? Once upon a time the
automount init.d script used to create the dirs. What exactly is the
problem with doing this once again?

It is a one-liner after all. 
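
A minimal sketch of what such a step could look like, assuming the
top-level mount points are the first field of each non-comment line in
/etc/auto.master. The function name ensure_mountpoints is hypothetical,
and this is not the code from the old init script:

```shell
#!/bin/sh
# Hypothetical helper: create every top-level mount point listed in an
# autofs master map before the daemon is started.
ensure_mountpoints() {
    master="${1:-/etc/auto.master}"
    # Nothing to do if the master map is missing or unreadable.
    [ -r "$master" ] || return 0
    # The first field of each entry is the mount point; the rest is ignored.
    while read -r mountpoint _rest || [ -n "$mountpoint" ]; do
        case "$mountpoint" in
            # Skip blank lines, comments, included maps and direct maps.
            ''|'#'*|'+'*|'/-') continue ;;
        esac
        [ -d "$mountpoint" ] || mkdir -p "$mountpoint"
    done < "$master"
}
```

Calling ensure_mountpoints with no argument from the start branch of the
init script, before automount is launched, would recreate /misc and
friends on every boot.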

Brgds,

 
 
 Regards,
 
 Jan
-- 
   Understanding is a three-edged sword:
your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail:  aiva...@sigsegv.cx
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov ariva...@sigsegv.cx
Fingerprint: C824 CBD7 EE4B D7F8 5331  89D5 FCDA 572E DDE5 E715






Bug#520095: autofs: 100% reproducible on NFS root for the last two releases

2009-04-02 Thread Anton Ivanov

Package: autofs
Version: 4.1.4+debian-2.1
Followup-For: Bug #520095


I had the same problem on NFS root with Sarge and it still exists in Lenny. 

Prior to Sarge, the autofs init script checked whether the mountpoint
dirs existed and created them if they did not. Without this check autofs
is broken on NFS root systems (100% reproducible).

-- System Information:
Debian Release: 5.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages autofs depends on:
ii  libc6 2.7-18 GNU C Library: Shared libraries
ii  ucf   3.0016 Update Configuration File: preserv

Versions of packages autofs recommends:
ii  module-init-tools3.4-1   tools for managing Linux kernel mo
ii  nfs-common   1:1.1.2-6lenny1 NFS support files common to client

autofs suggests no packages.

-- no debconf information




Bug#403915: RIPD loses sanity after changing a large chunk of iptable rules

2006-12-20 Thread Anton Ivanov

Package: quagga
Version: 0.98.3-7.2
Severity: serious

This is observed on only one of our several firewall systems (and not
the most loaded or most complex one). They have 1000+ iptables rules
generated by scripts, and after the rules are reloaded ripd stops
working: the process is still running but it does not generate any
further updates. 

The vtysh interface shows all relevant RIP commands and a correct RIP
configuration. Nothing obvious in the log so far.

I will try to build the version from testing and test it after the 27th
of December to see if it suffers from the same bug.

-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.6.14-1-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages quagga depends on:
ii  iproute   20041019-3 Professional tools to control the 
ii  libc6 2.3.2.ds1-22sarge4 GNU C Library: Shared libraries an
ii  libcap1   1:1.10-14  support for getting/setting POSIX.
ii  libncurses5   5.4-4  Shared libraries for terminal hand
ii  libpam0g  0.76-22Pluggable Authentication Modules l
ii  libreadline4  4.3-11 GNU readline and history libraries
ii  logrotate 3.7-5  Log rotation utility

-- no debconf information



Bug#375967: Segfaults

2006-06-29 Thread Anton Ivanov

Package: fam
Version: 2.7.0-6sarge1
Severity: serious


fam segfaults when running on a heavily loaded server. The machine in
question is an IMAP server running courier (with fam support) and also
an NFS server (circa 100 users). When started as /usr/sbin/famd -T 0 it
exits after a few minutes.
Running it in the foreground with -v does not produce anything useful:
there are multiple messages about clients closing connections and a
segfault at the end.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.6.10-1-k8-smp
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages fam depends on:
ii  libc6   2.3.2.ds1-22 GNU C Library: Shared libraries an
ii  libgcc1 1:3.4.3-13   GCC support library
ii  libstdc++5  1:3.3.5-13   The GNU Standard C++ Library v3
ii  portmap 5-9  The RPC portmapper

-- no debconf information



Bug#375967: Acknowledgement (Segfaults)

2006-06-29 Thread Anton Ivanov

I tried to run it under gdb without success. It gets a signal 33 after a
while which I think is actually a GDB artefact.

Any ideas on how to debug this will be appreciated.

-- 
   Understanding is a three-edged sword:
 your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail:  [EMAIL PROTECTED]
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov [EMAIL PROTECTED]
Fingerprint: C824 CBD7 EE4B D7F8 5331  89D5 FCDA 572E DDE5 E715






Bug#375967: Acknowledgement (Segfaults)

2006-06-29 Thread Anton Ivanov

Thomas Girard wrote:
> Quoting Anton Ivanov [EMAIL PROTECTED]:
>
>> I tried to run it under gdb without success. It gets a signal 33 after a
>> while which I think is actually a GDB artefact.
>>
>> Any ideas on how to debug this will be appreciated.
>
> If this is *really* an artefact you can use
>
> handle SIG33 noprint nostop.

famd: NetConnection.c++:252: void NetConnection::flush(): Assertion `ret
== omsgList->len' failed.

Program received signal SIGABRT, Aborted.
0xb7def83b in raise () from /lib/tls/libc.so.6
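
For reference, the same steps can be scripted so that SIG33 is ignored
and a backtrace is captured without an interactive session. This is a
sketch only: the function name write_gdb_script and the file path are
hypothetical, and only the three gdb commands come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that ignores SIG33,
# runs the program, and prints a full backtrace once it stops
# (for example on the SIGABRT raised by the failed assertion).
write_gdb_script() {
    cat > "$1" <<'EOF'
handle SIG33 noprint nostop
run
bt full
EOF
}
```

Usage would be along the lines of
write_gdb_script /tmp/famd.gdb && gdb -batch -x /tmp/famd.gdb --args /usr/sbin/famd -T 0 -f
assuming that -f keeps famd in the foreground on this version.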

-- 
   Understanding is a three-edged sword:
 your side, their side, and the truth. --Kosh Naranek

A. R. Ivanov
E-mail:  [EMAIL PROTECTED]
WWW: http://www.sigsegv.cx/
pub 1024D/DDE5E715 2002-03-03 Anton R. Ivanov [EMAIL PROTECTED]
Fingerprint: C824 CBD7 EE4B D7F8 5331  89D5 FCDA 572E DDE5 E715






Bug#375967: Acknowledgement (Segfaults)

2006-06-29 Thread Anton Ivanov

Thomas Girard wrote:
> Quoting Anton Ivanov [EMAIL PROTECTED]:
>
>> famd: NetConnection.c++:252: void NetConnection::flush(): Assertion `ret
>> == omsgList->len' failed.
>>
>> Program received signal SIGABRT, Aborted.
>> 0xb7def83b in raise () from /lib/tls/libc.so.6
>
> Great.  And what does `bt full' give you then ?


Program received signal SIGABRT, Aborted.
0xb7def83b in raise () from /lib/tls/libc.so.6
(gdb) bt full
#0  0xb7def83b in raise () from /lib/tls/libc.so.6
No symbol table info available.
#1  0xb7df0fa2 in abort () from /lib/tls/libc.so.6
No symbol table info available.
#2  0xb7de92df in __assert_fail () from /lib/tls/libc.so.6
No symbol table info available.
#3  0x080559aa in NetConnection::flush (this=0x80c684c) at
NetConnection.c++:263
ret = 0
#4  0x08055853 in NetConnection::mprintf (this=0xb7ef8e80, format=0x0)
at NetConnection.c++:236
msg = (NetConnection::msgList_s *) 0x80f5d88
#5  0x0804a321 in ClientConnection::send_event (this=0x80c684c,
[EMAIL PROTECTED], request=409,
name=0x80f5c40 X-Debian-Apps-Net-komba2.desktop) at
ClientConnection.c++:54
code = 0 '\0'
#6  0x0805c012 in TCP_Client::post_event (this=0x80c6820,
[EMAIL PROTECTED], request=409,
path=0x80f5c40 X-Debian-Apps-Net-komba2.desktop) at TCP_Client.c++:312
No locals.
#7  0x0804a9fb in ClientInterest::post_event (this=0x80edcc0, [EMAIL PROTECTED],
eventpath=0x80f5c40 X-Debian-Apps-Net-komba2.desktop) at
ClientInterest.c++:131
No locals.
#8  0x0804bce5 in DirEntry::post_event (this=0x80f5c98, [EMAIL PROTECTED],
eventpath=0x0) at Interest.h:61
No locals.
#9  0x0804cdc6 in DirectoryScanner::done (this=0x8076560) at
DirectoryScanner.c++:149
dp = (dirent *) 0x0
ep = (class DirEntry *) 0x80f5c98
epp2 = (class DirEntry **) 0x6
ready = true
#10 0x0804c0f4 in Directory (this=0x80edcc0, name=0x0, c=0x0, r=0,
[EMAIL PROTECTED]) at Directory.c++:54
No locals.
#11 0x0805342c in MxClient::monitor_dir (this=0x80c6820, request=409,
path=0xbfffe6d0
/exports/systems-team-home/mf2/.local/share/applications/menu-xdg,
[EMAIL PROTECTED])
at MxClient.c++:92
ip = (class ClientInterest *) 0xbfffe668
#12 0x0805bb5f in TCP_Client::input_msg (this=0x80c6820, msg=0xbfffe6c0
À\\\f\b, size=104)
at TCP_Client.c++:198
i = 6
grouplist = (gid_t *) 0x80edca0
ngroups = -1073748272
c = {static SuperUser = {static SuperUser = same as static
member of an already seen type,
p = 0x806e008, static untrusted = {static SuperUser = same as
static member of an already seen type,
  p = 0x8073ca0, static untrusted = same as static member of an
already seen type,
  static insecure_compat = false, static impllist = 0x80727e0,
static nimpl = 16,
  static nimpl_alloc = 22}, static insecure_compat = false, static
impllist = 0x80727e0,
static nimpl = 16, static nimpl_alloc = 22}, p = 0x80c5cc0,
  static untrusted = same as static member of an already seen type,
static insecure_compat = false,
  static impllist = 0x80727e0, static nimpl = 16, static nimpl_alloc = 22}
got_N_with_groups = false
p = 0x80c6b49 
q = 0x80c6b49 
opcode = 77 'M'
reqnum = 409
uid = 1039
gid = 1039
filename =
/exports/systems-team-home/mf2/.local/share/applications/menu-xdg\000\000/xdg-applications/Windows+Applications/Programs/FirstClass\000\000martSketch\000\000dia+Browser\000\000P\221þ·\000\000\000\000µØ\006\000\000èÿ¿ètò·¤çÿ¿\000\020\000\000X.Ý·ètò·\000\000\000\000\2256Ê\006
èÿ¿...
i = 6
msg_cred = {static SuperUser = {static SuperUser = same as
static member of an already seen type,
p = 0x806e008, static untrusted = {static SuperUser = same as
static member of an already seen type,
  p = 0x8073ca0, static untrusted = same as static member of an
already seen type,
  static insecure_compat = false, static impllist = 0x80727e0,
static nimpl = 16,
  static nimpl_alloc = 22}, static insecure_compat = false, static
impllist = 0x80727e0,
static nimpl = 16, static nimpl_alloc = 22}, p = 0x80c5cc0,
  static untrusted = same as static member of an already seen type,
static insecure_compat = false,
  static impllist = 0x80727e0, static nimpl = 16, static nimpl_alloc = 22}
#13 0x0805b766 in TCP_Client::input_handler (msg=0x6 Address 0x6 out of
bounds, nbytes=0,
closure=0x80c6820) at TCP_Client.c++:69
No locals.
#14 0x0804a2c6 in ClientConnection::input_msg (this=0x6, msg=0x0,
nbytes=0) at ClientConnection.c++:40
No locals.
#15 0x08055692 in NetConnection::deliver_input (this=0x80c684c) at
NetConnection.c++:170
ihead = 0x80c6ade 
remaining = 135031626
#16 0x08059354 in Scheduler::handle_io (fds=0xb840,
iotype=Scheduler::FDInfo::read) at Scheduler.c++:315
fp = (Scheduler::FDInfo *) 0x6
fd = 364
#17 0x08059431 in Scheduler::select () at Scheduler.c

Bug#308792: After the last update Via C3 systems give assertion failed in ld.so at boot

2005-05-12 Thread Anton Ivanov

Package: libc6
Version: 2.3.2.ds1-21
Severity: critical



After the last update, Via C3 Version 1 systems with a 2.6 kernel image
fail on boot with:

Inconsistency detected by ld.so: do_rel.h: 109 elf_dynamic_do_rel: Assertion 
'(map->l_info[(34+0+(0x6ff - (0x6ff0)))] != ((void *0))' failed!

Tested with the following 2.6 images: 
older 2.6.10-1-386 (subversion 2), 
current 2.6.10-1-386 (subversion 10),
whatever the Debian installer tries to put on the system when booted 
with 2.6 (most likely current 2.6.10-1-386),
2.6.9 (686 config altered to optimize for 386).

With the default 2.4.18 image from woody the system boots normally.

With 2.6.10-2 and 2.6.9 it boots normally if libc6 is 2.3.2.ds1-20 or 
earlier.

I am looking at the changelog for ds1-21 and so far I have no idea what could 
have caused it.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.4.18-bf2.4
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages libc6 depends on:
ii  libdb1-compat 2.1.3-7The Berkeley database routines [gl

-- no debconf information



Bug#308792: After the last update Via C3 systems give assertion failed in ld.so at boot

2005-05-12 Thread Anton Ivanov

Daniel Jacobowitz wrote:
> On Thu, May 12, 2005 at 11:39:23AM +, Anton Ivanov wrote:
>
>> Package: libc6
>> Version: 2.3.2.ds1-21
>> Severity: critical
>>
>> After the last update C3 Version 1
>> with a kernel 2.6 image will fail on boot with:
>> Inconsistency detected by ld.so: do_rel.h: 109 elf_dynamic_do_rel: Assertion
>> '(map->l_info[(34+0+(0x6ff - (0x6ff0)))] != ((void *0))' failed!
>>
>> Tested with the following 2.6 images:
>> older 2.6.10-1-386 (subversion 2),
>> current 2.6.10-1-386 (subversion 10)
>> whatever the debian installer tries to put on the system when booted with 2.6 - most likely
>> current 2.6.10-1-386
>> 2.6.9 (686 config altered to optimize for 386).
>>
>> with the default 2.4.18 image from woody will boot normally
>> with 2.6.10-2 and 2.6.9 will boot normally if the libc6 is 2.3.2.ds1-20 or
>> earlier.
>> I am looking at the changelog for ds1-21 and so far I have no idea what could have caused it.
>
> That's:
> #ifdef RTLD_BOOTSTRAP
> /* The dynamic linker always uses versioning. */
> assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
> #else
> The problem is not going to be anywhere near there. That is a check on
> the ld.so binary, which works elsewhere. Probably your mmap is busted.
> I do not see anything that would cause this in -21 either.

I tested with a few more versions and some alternative hardware.
2.4.27 also does not boot unless you turn off ACPI and APIC. If you turn
them off it boots.

All 2.6 images and 2.4.27 boot OK on C3 V2.
It is starting to look like an interaction of specific hardware and
drivers - C3 V1, CMD649 IDE and a few others. I still have no idea why
it bombs out in mmap/ld.so. With a hardware problem I would have
expected it to barf much earlier and in a more consistent manner.

A.
--
Le Châtelier's Law:

If some stress is brought to bear on a system in equilibrium,
the equilibrium is displaced in the direction which tends to undo the
effect of the stress.
