---------- Forwarded message ---------
From: a aa <[email protected]>
Date: Mon, Aug 11, 2025 at 11:40 PM
Subject: Re: bug#79194: Segfault with 0 byte symbol
To: Tomas Volf <[email protected]>
Hello, Tomas <[email protected]>
> You wrote "handle 0 byte *strings*" (emphasis mine). That is not what
> you are doing. NULL is not a "0 byte string". "" (almost) is. Or a
> char* can be. But not NULL.
The string "" does not have 0 bytes in UTF-8. It has 1 byte for the null
terminator, and reading that byte is still an out-of-bounds read when 0 is
passed as the length.
I probably should have mentioned this in the first message, but I ran into
this issue in Rust, which doesn't use null-terminated strings, so the C
example wasn't exactly accurate.
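To illustrate the difference in representation, here is a minimal Rust
sketch. The function names are hypothetical and exist only for this example.
It shows that an empty &str is a pointer plus a length of 0 with no
terminator behind it, while a C-style empty string still occupies one byte.

```rust
use std::ffi::CString;

/// Returns the raw parts of an empty Rust string slice: (pointer, length in bytes).
/// The pointer is non-null, but with a length of 0 no byte behind it may be read.
fn empty_str_parts() -> (*const u8, usize) {
    let s: &str = "";
    (s.as_ptr(), s.len())
}

/// Returns how many bytes a C-style empty string occupies, terminator included.
fn c_empty_string_size() -> usize {
    let c = CString::new("").expect("no interior NUL bytes");
    c.as_bytes_with_nul().len()
}
```

So a callee that reads even one byte through the pointer from the first
function is reading memory the caller never promised was there.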
rustc -lguile-3.0 main.rs
or, if you don't have Rust:
c++ main.cpp `pkg-config --cflags --libs guile-3.0`
> Both work fine. You cannot just send a null pointer to a function that
> is supposed to take a string and expect it to work. So I do not think
> it is valid to declare this to be an "incorrect behaviour", maybe
> "unexpected" (by you) would be better description.
The function does not take a string; it takes an array of UTF-8 bytes and a
length, so the function should not expect the pointer to have a null
terminator.
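As a caller-side workaround, one could substitute a readable pointer whenever
the length is 0. The sketch below assumes, as reported earlier in this
thread, that passing "" with a length of 0 works; ffi_parts and EMPTY are
hypothetical names and not part of Guile.

```rust
use std::ffi::c_char;

/// A single NUL byte with static lifetime; one byte can always be read from it.
static EMPTY: [u8; 1] = [0];

/// Hypothetical helper (not part of Guile): turns a byte slice into a
/// (pointer, length) pair for a C function taking `const char *` and `size_t`.
/// For an empty slice it substitutes a pointer to a readable NUL byte.
fn ffi_parts(bytes: &[u8]) -> (*const c_char, usize) {
    if bytes.is_empty() {
        (EMPTY.as_ptr().cast(), 0)
    } else {
        (bytes.as_ptr().cast(), bytes.len())
    }
}
```

In the Rust reproducer the call would then become
`let (p, n) = ffi_parts(str.as_bytes());` followed by
`scm_from_utf8_symboln(p, n);`. This only sidesteps the problem on the caller
side and does not change what the function itself does.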
On Sun, Aug 10, 2025 at 2:34 PM Tomas Volf <[email protected]> wrote:
> Hi,
>
> a aa <[email protected]> writes:
>
> > Hello,
> >
> > how to reproduce:
> > cc main.c `pkg-config --libs --cflags guile-3.0` && ./a.out
> >
> > incorrect behaviour:
> > The second parameter for scm_from_utf8_symboln should be how many bytes
> > are pointed to by the pointer however the pointer still gets read if the
> > length is zero. Being able to handle 0 byte strings is expected since
> > replacing the call from scm_from_utf8_symboln to scm_from_utf8_stringn
> > will not have a segfault and the documentation for this function does
> > not mention being unable to handle 0 byte strings.
>
> Well, the scm_from_utf8_symboln is just not documented at all, so I am
> not sure how you have determined that the "documentation for this
> function does not mention ...". But let us ignore that for a moment.
>
> You wrote "handle 0 byte *strings*" (emphasis mine). That is not what
> you are doing. NULL is not a "0 byte string". "" (almost) is. Or a
> char* can be. But not NULL.
>
> >
> > [..]
> >
> > #include <libguile.h>
> >
> > void* inner_main(void*) {
> > SCM sym = scm_from_utf8_symboln(NULL, 0);
>
> The line should be
>
> SCM sym = scm_from_utf8_symboln("", 0);
>
> or
>
> const char zero_str[] = {};
> SCM sym = scm_from_utf8_symboln(zero_str, 0);
>
> Both work fine. You cannot just send a null pointer to a function that
> is supposed to take a string and expect it to work. So I do not think
> it is valid to declare this to be an "incorrect behaviour", maybe
> "unexpected" (by you) would be better description.
>
> Tomas
>
> --
> There are only two hard things in Computer Science:
> cache invalidation, naming things and off-by-one errors.
>
// main.rs
use std::{
    ffi::{c_char, c_void},
    ptr,
};

extern "C" {
    fn scm_with_guile(
        _: Option<unsafe extern "C" fn(_: *mut c_void) -> *mut c_void>,
        _: *mut c_void,
    ) -> *mut c_void;
    fn scm_from_utf8_symboln(_: *const c_char, _: usize) -> *mut c_void;
}

unsafe extern "C" fn inner_main(_: *mut c_void) -> *mut c_void {
    // Empty &str: as_ptr() is non-null but not guaranteed to point at
    // readable memory, and len() is 0.
    let str = "";
    unsafe {
        scm_from_utf8_symboln(str.as_ptr().cast(), str.len());
    }
    ptr::null_mut()
}

fn main() {
    unsafe {
        scm_with_guile(Some(inner_main), ptr::null_mut());
    }
}
// main.cpp
#include <libguile.h>
#include <vector>

void* inner_main(void*) {
    // Empty vector: size() is 0 and data() may be a null pointer.
    std::vector<char> raw_zero_utf8_str = {};
    SCM sym = scm_from_utf8_symboln(raw_zero_utf8_str.data(), raw_zero_utf8_str.size());
    return NULL;
}

int main(void) {
    scm_with_guile(inner_main, NULL);
    return 0;
}