Re: new rust
>>    for i in 0..u64::MAX {
>>        match libc::_cpuset_isset(i, set) {
>> [...]
>> but ... under which conditions would it seg-fault inside that
>> function?
>
> What does the Rust impl. of _cpuset_isset() look like? Does it
> take ints by any chance and you're passing a u64 to it here? A C
> compiler will complain if you use `-m32', but, that's all. Don't
> know how the Rust FFI will handle this. That's all I can think
> of...

The relevant rust definitions were (from
vendor/libc/src/unix/bsd/netbsdlike/netbsd/mod.rs):

    pub type cpuid_t = u64;

    extern "C" {
        pub fn _cpuset_isset(cpu: cpuid_t, set: *const cpuset_t) -> ::c_int;
    }

Of these, the cpuid_t was wrong, because in C it is

    typedef unsigned long cpuid_t;

(from ), and that's a 32-bit type on ILP32 ports. On such systems,
seen from the 32-bit "actual" libc side, this would cause rust to
do the equivalent of

    _cpuset_isset(0, NULL)

which is of course going to cause an immediate NULL pointer
de-reference.

This is now all on the way to being fixed, since this pull request
has been accepted and applied upstream:

    https://github.com/rust-lang/libc/pull/3386

and I've applied this patch to the various "rust libc*" versions
vendored inside rust, and have re-built the 1.72.1 bits with this
fix as well.

>> Debugging the C program reveals that pthread_getaffinity_np() has
>> done exactly nothing to the "cset" contents as near as I can
>> tell, the "bits" entry doesn't change.
>
> pthread_getaffinity_np() _can_ be used to get the no. of "online"
> CPUs on both Linux and FreeBSD, but it looks (from my perusal just
> now) like threads default to no affinity on NetBSD and the scheduler
> just picks whatever CPUs available for it--unless the affinity is
> explicitly set, in which case it's inherited.
>
> I think you should just use sysconf(_SC_NPROCESSORS_ONLN) or the
> equivalent on NetBSD.

That threads default to no affinity on NetBSD matches what I'm
seeing and hearing.
However, the affinity set *can* be tweaked by schedctl (which
appears to require root privileges).

The fallback code in rust already does as you suggest: if the probe
for the number of CPUs the thread has affinity to is 0, the code
probes for _SC_NPROCESSORS_ONLN, and if that returns < 1, then
probes for HW_NCPU.

Regards,

- Håvard
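
To make the failure mode described above concrete, here is a simplified model of the argument mismatch. It is a sketch, not the real calling convention of any particular port: it assumes 32-bit argument slots filled in order and ignores alignment and register-passing details. The helper names `caller_words` and `callee_view` are made up for illustration and do not exist in rust or in libc.

```rust
/// Argument words as laid out by a caller that believes `cpu` is 64 bits
/// wide, on a target with 32-bit argument slots.
/// Hypothetical helper; simplified model, not a real ABI.
fn caller_words(cpu: u64, set_ptr: u32, big_endian: bool) -> Vec<u32> {
    let lo = cpu as u32;
    let hi = (cpu >> 32) as u32;
    if big_endian {
        vec![hi, lo, set_ptr]
    } else {
        vec![lo, hi, set_ptr]
    }
}

/// What a callee sees when it expects (32-bit cpu, 32-bit pointer):
/// it simply reads the first two argument words.
fn callee_view(words: &[u32]) -> (u32, u32) {
    (words[0], words[1])
}

fn main() {
    // Arbitrary non-NULL address, for illustration only.
    let set_ptr: u32 = 0x0804_a000;
    for big_endian in [false, true] {
        let (cpu, set) = callee_view(&caller_words(0, set_ptr, big_endian));
        println!("big_endian={}: callee sees cpu={}, set={:#x}", big_endian, cpu, set);
    }
}
```

In this model the very first loop iteration (cpu 0) hands the callee a zero word where it expects the set pointer, regardless of byte order, which is the `_cpuset_isset(0, NULL)` call mentioned above.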
Re: new rust (was: gdb issues?)
On Wed, 11 Oct 2023, Havard Eidnes wrote:

>> Program terminated with signal SIGSEGV, Segmentation fault.
>> ...
>> #0  0x60d0fe74 in _cpuset_isset () from /usr/lib/libc.so.12
>> #1  0x03d2bf8c in std::sys::unix::thread::available_parallelism ()
>> ...
>> At least it gives a bit of clue about where to go looking for the
>> null pointer de-reference, so that's at least something...
>
> This gets me to
>
>   work/rustc-1.73.0-src/library/std/src/sys/unix/thread.rs
>
> which says:
>
>     for i in 0..u64::MAX {
>         match libc::_cpuset_isset(i, set) {
> [...]
>
> but ... under which conditions would it seg-fault inside that
> function?

What does the Rust impl. of _cpuset_isset() look like? Does it
take ints by any chance and you're passing a u64 to it here? A C
compiler will complain if you use `-m32', but, that's all. Don't
know how the Rust FFI will handle this. That's all I can think
of...

> Debugging the C program reveals that pthread_getaffinity_np() has
> done exactly nothing to the "cset" contents as near as I can
> tell, the "bits" entry doesn't change.

pthread_getaffinity_np() _can_ be used to get the no. of "online"
CPUs on both Linux and FreeBSD, but it looks (from my perusal just
now) like threads default to no affinity on NetBSD and the scheduler
just picks whatever CPUs available for it--unless the affinity is
explicitly set, in which case it's inherited.

I think you should just use sysconf(_SC_NPROCESSORS_ONLN) or the
equivalent on NetBSD.

HTH,

-RVP
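
The suggestion above amounts to treating an empty affinity set as "no answer" and falling through to the online-CPU count. A minimal sketch of that probe order follows, assuming the three probes discussed in this thread (affinity set, `_SC_NPROCESSORS_ONLN`, `HW_NCPU`). The function `pick_cpu_count` and its parameters are hypothetical; the real code obtains each value from libc rather than taking it as an argument.

```rust
use std::num::NonZeroUsize;

/// Pick a CPU count from three probe results, tried in order.
/// Hypothetical helper: the inputs stand in for the results of the
/// affinity probe, sysconf(_SC_NPROCESSORS_ONLN) and sysctl hw.ncpu.
fn pick_cpu_count(
    affinity_count: usize,
    sysconf_online: i64,
    hw_ncpu: u32,
) -> Option<NonZeroUsize> {
    // 1. CPUs in the thread's affinity set; 0 means no affinity is set.
    if let Some(n) = NonZeroUsize::new(affinity_count) {
        return Some(n);
    }
    // 2. Online CPU count; anything below 1 (including -1 on error) is rejected.
    if sysconf_online >= 1 {
        return NonZeroUsize::new(sysconf_online as usize);
    }
    // 3. Last resort; 0 means the count is not known.
    NonZeroUsize::new(hw_ncpu as usize)
}

fn main() {
    // Default case described above: no affinity set, so the
    // online-CPU count is what ends up being used.
    println!("{:?}", pick_cpu_count(0, 4, 4));
}
```

Returning `Option<NonZeroUsize>` keeps "unknown" distinct from any real count, so a caller cannot mistake a failed probe for a machine with zero CPUs.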
Re: new rust (was: gdb issues?)
> Program terminated with signal SIGSEGV, Segmentation fault.
> ...
> #0  0x60d0fe74 in _cpuset_isset () from /usr/lib/libc.so.12
> #1  0x03d2bf8c in std::sys::unix::thread::available_parallelism ()
> ...
> At least it gives a bit of clue about where to go looking for the
> null pointer de-reference, so that's at least something...

This gets me to

  work/rustc-1.73.0-src/library/std/src/sys/unix/thread.rs

which says:

    #[cfg(target_os = "netbsd")]
    {
        unsafe {
            let set = libc::_cpuset_create();
            if !set.is_null() {
                let mut count: usize = 0;
                if libc::pthread_getaffinity_np(libc::pthread_self(), libc::_cpuset_size(set), set) == 0 {
                    for i in 0..u64::MAX {
                        match libc::_cpuset_isset(i, set) {
                            -1 => break,
                            0 => continue,
                            _ => count = count + 1,
                        }
                    }
                }
                libc::_cpuset_destroy(set);
                if let Some(count) = NonZeroUsize::new(count) {
                    return Ok(count);
                }
            }
        }
    }

which on the surface looks innocent enough, and this is as near as
I can tell the same code as in rust 1.72.1, while the code in
1.71.1 is different, and falls back to using sysctl with this code
(the bootstrap program may be linked with the "old" standard
library, so the problem may have been in 1.72.1 too):

    let mut cpus: libc::c_uint = 0;
    let mut cpus_size = crate::mem::size_of_val(&cpus);

    unsafe {
        cpus = libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as libc::c_uint;
    }

    // Fallback approach in case of errors or no hardware threads.
    if cpus < 1 {
        let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];
        let res = unsafe {
            libc::sysctl(
                mib.as_mut_ptr(),
                2,
                &mut cpus as *mut _ as *mut _,
                &mut cpus_size as *mut _ as *mut _,
                ptr::null_mut(),
                0,
            )
        };

        // Handle errors if any.
        if res == -1 {
            return Err(io::Error::last_os_error());
        } else if cpus == 0 {
            return Err(io::const_io_error!(io::ErrorKind::NotFound, "The number of hardware threads is not known for the target platform"));
        }
    }
    Ok(unsafe { NonZeroUsize::new_unchecked(cpus as usize) })

(Actually, the fallback code is there in 1.73.0 and 1.72.1 too,
it's just not used due to the addition of the netbsd-specific
section above...)

The cpuset(3) man page says

     cpuset_isset(cpu, set)
              Checks if CPU specified by cpu is set in the CPU-set set.
              Returns the positive number if set, zero if not set, and -1
              if cpu is invalid.

but ... under which conditions would it seg-fault inside that
function? Looking at the C code in common doesn't reveal anything
frightening...

However, an attempt at a trivial re-implementation "to count CPUs"
in this manner in C does not trigger this issue on any of my
"problematic" platforms (or on amd64 for that matter):

    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
        int count = 0;
        cpuset_t *cset;
        int i;
        int ret;

        cset = cpuset_create();
        if (cset != NULL) {
            cpuset_zero(cset);
            if (pthread_getaffinity_np(pthread_self(),
                                       cpuset_size(cset),
                                       cset) == 0)
            {
                for (i = 0; i < 256; i++) {
                    ret = cpuset_isset(i, cset);
                    if (ret == -1)
                        break;
                    if (ret == 0)
                        continue;
                    count++;
                }
            }
        }
        printf("cpus: %d\n", count);
        return 0;
    }

but also fails to count the number of CPUs (prints 0).

So what am I (and/or rust) doing wrong?

Or ... is this code simply wrong anyway, and we need to re-instate
the 1.71.1 code path by ripping out the NetBSD-specific section
quoted above?

Meanwhile, the warning in the pthread_getaffinity_np man page is
ignored:

     Portable applications should not use the pthread_setaffinity_np() and
     pthread_getaffinity_np() functions.

Although it could perhaps be argued that rust isn't all that
portable..., and perhaps in particular this piece of code?

Debugging the C program