---------- Forwarded message ---------
From: a aa <[email protected]>
Date: Mon, Aug 11, 2025 at 11:40 PM
Subject: Re: bug#79194: Segfault with 0 byte symbol
To: Tomas Volf <[email protected]>


Hello, Tomas <[email protected]>

> You wrote "handle 0 byte *strings*" (emphasis mine).  That is not what
> you are doing.  NULL is not a "0 byte string".  "" (almost) is.  Or a
> char* can be.  But not NULL.

The C string "" is not 0 bytes of UTF-8: it occupies 1 byte for the null
terminator, and reading that byte is still an out-of-bounds read when the
length passed is 0.
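
To make the byte counts concrete, here is a minimal sketch in Rust. It only
uses the standard library; `c_string_sizes` is a hypothetical helper name
introduced for illustration and has nothing to do with libguile itself.

```rust
use std::ffi::CStr;

/// Hypothetical helper: returns (payload length, bytes actually stored)
/// for a C-style, null-terminated string.
fn c_string_sizes(s: &CStr) -> (usize, usize) {
    (s.to_bytes().len(), s.to_bytes_with_nul().len())
}

fn main() {
    // The C literal "" is stored as a single NUL byte.
    let empty_c = CStr::from_bytes_with_nul(b"\0").expect("valid C string");
    let (payload, stored) = c_string_sizes(empty_c);
    println!("C \"\": payload = {payload}, stored = {stored}");

    // A Rust &str carries no terminator at all.
    let empty_rust = "";
    println!("Rust \"\": len = {}", empty_rust.len());
}
```

So a callee that is told "0 bytes" but still reads one byte only gets away
with it when the caller happened to pass a null-terminated C literal.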

I probably should have mentioned this in the first message, but I
experienced this issue in Rust, which doesn't use null-terminated strings,
so the example wasn't exactly accurate.

    rustc -lguile-3.0 main.rs

or, if you don't have Rust:

    c++ main.cpp `pkg-config --cflags --libs guile-3.0`
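
For anyone less familiar with Rust, this is roughly what an empty `&str`
looks like at the FFI boundary. It is a sketch using only the standard
library; `raw_parts` is a hypothetical helper name, not an existing API.

```rust
/// Hypothetical helper: the (pointer, length) pair a &str hands to C.
fn raw_parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    // Empty string: the pointer is non-null, but it is valid for zero
    // bytes only, so there is no terminator to read behind it.
    let (ptr, len) = raw_parts("");
    println!("empty: null = {}, len = {}", ptr.is_null(), len);

    // Lengths are counted in UTF-8 bytes, not characters.
    let (_, len) = raw_parts("λ");
    println!("lambda: len = {}", len);
}
```

The pointer being non-null is what makes this different from the NULL case
in my first message: the callee cannot detect the problem by checking for
NULL, it has to respect the length.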

> Both work fine.  You cannot just send a null pointer to a function that
> is supposed to take a string and expect it to work.  So I do not think
> it is valid to declare this to be an "incorrect behaviour", maybe
> "unexpected" (by you) would be better description.

The function does not take a null-terminated string. It takes a pointer to
an array of UTF-8 bytes together with a length, so it should not expect a
null terminator behind the pointer.
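
As a sketch of the contract I would expect from a pointer-plus-length API,
here is a made-up function, `byte_sum`, that never touches the pointer when
the length is 0. This is an illustration under my own assumptions, not how
libguile is implemented.

```rust
/// Hypothetical C-style API: `ptr` points to `len` bytes, no terminator.
///
/// # Safety
/// If `len > 0`, `ptr` must be valid for reads of `len` bytes.
unsafe fn byte_sum(ptr: *const u8, len: usize) -> u64 {
    // With a length of 0 the pointer is never dereferenced, so NULL or a
    // dangling pointer is acceptable here.
    if len == 0 {
        return 0;
    }
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u64::from(b)).sum()
}

fn main() {
    let text = "abc";
    let empty = "";
    unsafe {
        println!("{}", byte_sum(text.as_ptr(), text.len()));
        println!("{}", byte_sum(empty.as_ptr(), empty.len()));
        println!("{}", byte_sum(std::ptr::null(), 0));
    }
}
```

With this shape, an empty Rust string, an empty C++ vector and a NULL
pointer with length 0 all behave the same way.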

On Sun, Aug 10, 2025 at 2:34 PM Tomas Volf <[email protected]> wrote:

> Hi,
>
> a aa <[email protected]> writes:
>
> > Hello,
> >
> > how to reproduce:
> > cc main.c `pkg-config --libs --cflags guile-3.0` && ./a.out
> >
> > incorrect behaviour:
> > The second parameter for scm_from_utf8_symboln should be how many bytes
> > are pointed to by the pointer however the pointer still gets read if the
> > length is zero. Being able to handle 0 byte strings is expected since
> > replacing the call from scm_from_utf8_symboln to scm_from_utf8_stringn
> > will not have a segfault and the documentation for this function does
> > not mention being unable to handle 0 byte strings.
>
> Well, the scm_from_utf8_symboln is just not documented at all, so I am
> not sure how you have determined that the "documentation for this
> function does not mention ...".  But let us ignore that for a moment.
>
> You wrote "handle 0 byte *strings*" (emphasis mine).  That is not what
> you are doing.  NULL is not a "0 byte string".  "" (almost) is.  Or a
> char* can be.  But not NULL.
>
> >
> > [..]
> >
> > #include <libguile.h>
> >
> > void* inner_main(void*) {
> >   SCM sym = scm_from_utf8_symboln(NULL, 0);
>
> The line should be
>
>     SCM sym = scm_from_utf8_symboln("", 0);
>
> or
>
>     const char zero_str[] = {};
>     SCM sym = scm_from_utf8_symboln(zero_str, 0);
>
> Both work fine.  You cannot just send a null pointer to a function that
> is supposed to take a string and expect it to work.  So I do not think
> it is valid to declare this to be an "incorrect behaviour", maybe
> "unexpected" (by you) would be better description.
>
> Tomas
>
> --
> There are only two hard things in Computer Science:
> cache invalidation, naming things and off-by-one errors.
>

main.rs:

use std::{
    ffi::{c_char, c_void},
    ptr,
};

extern "C" {
    fn scm_with_guile(
        _: Option<unsafe extern "C" fn(_: *mut c_void) -> *mut c_void>,
        _: *mut c_void,
    ) -> *mut c_void;
    fn scm_from_utf8_symboln(_: *const c_char, _: usize) -> *mut c_void;
}

unsafe extern "C" fn inner_main(_: *mut c_void) -> *mut c_void {
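    // An empty &str: len() is 0 and as_ptr() is non-null, but there is no
    // null terminator behind it, so not even one byte may be read.
    let str = "";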
    unsafe {
        scm_from_utf8_symboln(str.as_ptr().cast(), str.len());
    }

    ptr::null_mut()
}

fn main() {
    unsafe {
        scm_with_guile(Some(inner_main), ptr::null_mut());
    }
}

main.cpp:

#include <libguile.h>
#include <vector>

void* inner_main(void*) {
  std::vector<char> raw_zero_utf8_str = {};
  SCM sym = scm_from_utf8_symboln(raw_zero_utf8_str.data(), raw_zero_utf8_str.size());
  
  return NULL;
}

int main(void) {
  scm_with_guile(inner_main, NULL);

  return 0;
}
