Re: new rust

2023-10-16 Thread Havard Eidnes
>>     for i in 0..u64::MAX {
>>         match libc::_cpuset_isset(i, set) {
>> [...]
>> but ... under which conditions would it seg-fault inside that
>> function?
>
> What does the Rust impl. of _cpuset_isset() look like? Does it
> take ints by any chance and you're passing a u64 to it here? A C
> compiler will complain if you use `-m32', but, that's all. Don't
> know how the Rust FFI will handle this. That's all I can think
> of...

The relevant rust definitions were (from
vendor/libc/src/unix/bsd/netbsdlike/netbsd/mod.rs):

    pub type cpuid_t = u64;

    extern "C" {
        pub fn _cpuset_isset(cpu: cpuid_t, set: *const cpuset_t) -> ::c_int;
    }

Of these, the cpuid_t definition was wrong, because in the C
headers it is

    typedef unsigned long   cpuid_t;

and that's a 32-bit type on ILP32 ports.  On such systems rust
passes a 64-bit cpu argument where libc expects a 32-bit one, so
libc would end up reading half of that value as the set pointer.
Seen from the 32-bit "actual" libc side, this would cause rust to
do the equivalent of _cpuset_isset(0, NULL), which is of course
going to cause an immediate NULL pointer de-reference.
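
To make the mismatch concrete, here is a minimal sketch in rust of
how the arguments could get misread.  It assumes a simplified model
where arguments travel as consecutive 32-bit words, low half first;
real ILP32 ABIs differ in detail (registers, alignment padding), and
the names marshal_as_binding_declared and read_as_libc_expects are
made up for the illustration, they are not part of libc:

```rust
/// Caller side, per the old binding: a 64-bit cpu id followed by a
/// 32-bit pointer, laid out as three consecutive 32-bit words.
/// ASSUMPTION: simplified word-slot model, low half of the u64 first.
fn marshal_as_binding_declared(cpu: u64, set_ptr: u32) -> [u32; 3] {
    let lo = cpu as u32; // low 32 bits of the cpu id
    let hi = (cpu >> 32) as u32; // high 32 bits of the cpu id
    [lo, hi, set_ptr]
}

/// Callee side, per the C prototype on an ILP32 port: one 32-bit word
/// for the cpu id, and the very next word taken as the set pointer.
fn read_as_libc_expects(words: &[u32; 3]) -> (u32, u32) {
    (words[0], words[1])
}
```

With cpu = 0, as on the first loop iteration, both halves of the
64-bit value are zero, so in this model the callee sees cpu 0 and a
NULL set no matter which half comes first; the real pointer sits in
the third word and is never looked at.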

This is now all on its way to being fixed, since this pull request
has been accepted and applied upstream:

  https://github.com/rust-lang/libc/pull/3386

and I've applied this patch to the various "rust libc*" versions
vendored inside rust, and have re-built the 1.72.1 bits with this
fix as well.

>> Debugging the C program reveals that pthread_getaffinity_np() has
>> done exactly nothing to the "cset" contents as near as I can
>> tell, the "bits" entry doesn't change.
>
> pthread_getaffinity_np() _can_ be used to get the no. of "online"
> CPUs on both Linux and FreeBSD, but it looks (from my perusal just
> now) like threads default to no affinity on NetBSD and the scheduler
> just picks whatever CPUs available for it--unless the affinity is
> explicitly set, in which case it's inherited.
>
> I think you should just use sysconf(_SC_NPROCESSORS_ONLN) or the
> equivalent on NetBSD.

That threads default to no affinity on NetBSD matches what I'm
seeing and hearing.  However, the affinity set *can* be tweaked
by schedctl (which appears to require root privileges).

The fallback code in rust already does as you suggest: if the
probe for the number of CPUs the thread has affinity to returns 0,
the code probes for _SC_NPROCESSORS_ONLN, and if that returns < 1,
it then probes for HW_NCPU.
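
The shape of that fallback chain can be sketched roughly as below.
This is only an illustration under stated assumptions: first_usable
and parallelism_or_one are hypothetical helpers, and each probe is a
plain closure standing in for the real affinity / sysconf / sysctl
calls.  The only real API used is std::thread::available_parallelism(),
which is the public entry point to the code discussed here:

```rust
use std::num::NonZeroUsize;

/// Try each probe in order and return the first count that is at
/// least 1.  HYPOTHETICAL helper: the probes stand in for the
/// affinity, _SC_NPROCESSORS_ONLN and HW_NCPU lookups.
fn first_usable(probes: &[&dyn Fn() -> usize]) -> Option<NonZeroUsize> {
    probes.iter().find_map(|probe| NonZeroUsize::new(probe()))
}

/// What a caller of the standard library sees: either a non-zero
/// count or an error, here collapsed to "at least one CPU".
fn parallelism_or_one() -> usize {
    std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```

A thread with no affinity set would then be modelled by a first probe
returning 0, so that the second probe decides the answer.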

Regards,

- Håvard


Re: new rust (was: gdb issues?)

2023-10-15 Thread RVP

On Wed, 11 Oct 2023, Havard Eidnes wrote:


>> Program terminated with signal SIGSEGV, Segmentation fault.
>> ...
>> #0  0x60d0fe74 in _cpuset_isset () from /usr/lib/libc.so.12
>> #1  0x03d2bf8c in std::sys::unix::thread::available_parallelism ()
>> ...
>> At least it gives a bit of clue about where to go looking for the
>> null pointer de-reference, so that's at least something...
>
> This gets me to
>
> work/rustc-1.73.0-src/library/std/src/sys/unix/thread.rs
>
> which says:
>
>     for i in 0..u64::MAX {
>         match libc::_cpuset_isset(i, set) {
> [...]
> but ... under which conditions would it seg-fault inside that function?



What does the Rust impl. of _cpuset_isset() look like? Does it
take ints by any chance and you're passing a u64 to it here? A C
compiler will complain if you use `-m32', but, that's all. Don't
know how the Rust FFI will handle this. That's all I can think
of...


> Debugging the C program reveals that pthread_getaffinity_np() has
> done exactly nothing to the "cset" contents as near as I can
> tell, the "bits" entry doesn't change.



pthread_getaffinity_np() _can_ be used to get the no. of "online"
CPUs on both Linux and FreeBSD, but it looks (from my perusal just
now) like threads default to no affinity on NetBSD and the scheduler
just picks whatever CPUs available for it--unless the affinity is
explicitly set, in which case it's inherited.

I think you should just use sysconf(_SC_NPROCESSORS_ONLN) or the
equivalent on NetBSD.

HTH,

-RVP


Re: new rust (was: gdb issues?)

2023-10-11 Thread Havard Eidnes
> Program terminated with signal SIGSEGV, Segmentation fault.
...
> #0  0x60d0fe74 in _cpuset_isset () from /usr/lib/libc.so.12
> #1  0x03d2bf8c in std::sys::unix::thread::available_parallelism ()

...

> At least it gives a bit of clue about where to go looking for the
> null pointer de-reference, so that's at least something...

This gets me to

work/rustc-1.73.0-src/library/std/src/sys/unix/thread.rs

which says:

    #[cfg(target_os = "netbsd")]
    {
        unsafe {
            let set = libc::_cpuset_create();
            if !set.is_null() {
                let mut count: usize = 0;
                if libc::pthread_getaffinity_np(libc::pthread_self(), libc::_cpuset_size(set), set) == 0 {
                    for i in 0..u64::MAX {
                        match libc::_cpuset_isset(i, set) {
                            -1 => break,
                            0 => continue,
                            _ => count = count + 1,
                        }
                    }
                }
                libc::_cpuset_destroy(set);
                if let Some(count) = NonZeroUsize::new(count) {
                    return Ok(count);
                }
            }
        }
    }

which on the surface looks innocent enough, and this is as near
as I can tell the same code as in rust 1.72.1, while the code in
1.71.1 is different, and falls back to using sysctl with this
code (the bootstrap program may be linked with the "old" standard
library, so the problem may have been in 1.72.1 too):

    let mut cpus: libc::c_uint = 0;
    let mut cpus_size = crate::mem::size_of_val(&cpus);

    unsafe {
        cpus = libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as libc::c_uint;
    }

    // Fallback approach in case of errors or no hardware threads.
    if cpus < 1 {
        let mut mib = [libc::CTL_HW, libc::HW_NCPU, 0, 0];
        let res = unsafe {
            libc::sysctl(
                mib.as_mut_ptr(),
                2,
                &mut cpus as *mut _ as *mut _,
                &mut cpus_size as *mut _ as *mut _,
                ptr::null_mut(),
                0,
            )
        };

        // Handle errors if any.
        if res == -1 {
            return Err(io::Error::last_os_error());
        } else if cpus == 0 {
            return Err(io::const_io_error!(
                io::ErrorKind::NotFound,
                "The number of hardware threads is not known for the target platform"
            ));
        }
    }
    Ok(unsafe { NonZeroUsize::new_unchecked(cpus as usize) })

(Actually, the fallback code is there in 1.73.0 and 1.72.1 too,
it's just not used due to the addition of the netbsd-specific
section above...)

The cpuset(3) man page says

 cpuset_isset(cpu, set)
  Checks if CPU specified by cpu is set in the CPU-set set.
  Returns the positive number if set, zero if not set, and -1 if
  cpu is invalid.

but ... under which conditions would it seg-fault inside that function?
Looking at the C code in common doesn't reveal anything frightening...
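
For reference, the return convention from the man page and the way
the rust loop depends on it can be sketched with a mock.  MockCpuSet,
mock_isset and count_cpus are hypothetical stand-ins written only for
this illustration; they do not call into libc and say nothing about
how the real cpuset_t is laid out:

```rust
/// HYPOTHETICAL stand-in for a CPU set: bits[i] tells whether CPU i
/// is a member.  The real cpuset_t is opaque and not modelled here.
struct MockCpuSet {
    bits: Vec<bool>,
}

/// Mimics the documented convention: positive if the CPU is set,
/// zero if it is not set, and -1 if the cpu number is invalid.
fn mock_isset(cpu: u64, set: &MockCpuSet) -> i32 {
    if cpu >= set.bits.len() as u64 {
        return -1;
    }
    if set.bits[cpu as usize] { 1 } else { 0 }
}

/// Same loop shape as the NetBSD section of thread.rs: the range is
/// effectively unbounded, so the -1 return is what ends the loop.
fn count_cpus(set: &MockCpuSet) -> usize {
    let mut count = 0;
    for cpu in 0..u64::MAX {
        match mock_isset(cpu, set) {
            -1 => break,
            0 => continue,
            _ => count += 1,
        }
    }
    count
}
```

In this model a set with no bits set counts as 0 CPUs, which is the
case where a caller would have to fall back to some other probe.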

However, an attempt at a trivial re-implementation "to count
CPUs" in this manner in C does not trigger this issue on any of
my "problematic" platforms (or on amd64 for that matter):

#include <pthread.h>
#include <sched.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
	int count = 0;
	cpuset_t *cset;
	int i;
	int ret;

	cset = cpuset_create();
	if (cset != NULL) {
		cpuset_zero(cset);
		if (pthread_getaffinity_np(pthread_self(),
		    cpuset_size(cset), cset) == 0) {
			/* Count the CPUs present in the affinity set. */
			for (i = 0; i < 256; i++) {
				ret = cpuset_isset(i, cset);
				if (ret == -1)
					break;	/* invalid cpu number */
				if (ret == 0)
					continue;	/* not in the set */
				count++;
			}
		}
		cpuset_destroy(cset);
	}
	printf("cpus: %d\n", count);
	return 0;
}

but also fails to count the number of CPUs (prints 0). So what
am I (and/or rust) doing wrong?  Or ... is this code simply wrong
anyway, and we need to re-instate the 1.71.1 code path by ripping
out the NetBSD-specific section quoted above?

Meanwhile, the warning in the pthread_getaffinity_np man page is
ignored:

 Portable applications should not use the pthread_setaffinity_np() and
 pthread_getaffinity_np() functions.

Although it could perhaps be argued that rust isn't all that
portable..., and perhaps in particular this piece of code?

Debugging the C program