On 10/20/25 12:33, Peter Maydell wrote: > On Mon, 20 Oct 2025 at 16:01, Sean Anderson <[email protected]> wrote: >> >> On 10/18/25 03:21, Heinrich Schuchardt wrote: >> > On 10/17/25 23:35, Sean Anderson wrote: >> >> When semihosting 32-bit systems, the return value of FLEN will be stored >> >> in a 32-bit integer. To prevent wraparound, return -1 and set EOVERFLOW. >> >> This matches the behavior of stat(2). Static files don't need to be >> >> checked, since are always small. >> >> >> >> Signed-off-by: Sean Anderson <[email protected]> >> >> --- >> >> >> >> semihosting/arm-compat-semi.c | 17 ++++++++++++++--- >> >> 1 file changed, 14 insertions(+), 3 deletions(-) >> >> >> >> diff --git a/semihosting/arm-compat-semi.c b/semihosting/arm-compat-semi.c >> >> index c5a07cb947..57453ca6be 100644 >> >> --- a/semihosting/arm-compat-semi.c >> >> +++ b/semihosting/arm-compat-semi.c >> >> @@ -305,8 +305,19 @@ static uint64_t common_semi_flen_buf(CPUState *cs) >> >> return sp - 64; >> >> } >> >> +static void common_semi_flen_cb(CPUState *cs, uint64_t ret, int err) >> >> +{ >> >> + CPUArchState *env = cpu_env(cs); >> >> + >> >> + if (!err && !is_64bit_semihosting(env) && ret > INT32_MAX) { >> > >> > >> > The issue with the current implementation is that files with file sizes >> > over 4 GiB will be reported as file size < 4 -GiB on 32bit systems. Thanks >> > for addressing this. >> > >> > But unfortunately with your change you are additionally dropping support >> > for file sizes 2 GiB to 4 GiB on 32bit devices. This should be avoided. >> > >> > The semihosting specification specifies that the value returned in r0 >> > should be -1 if an error occurs. So on 32 bit systems 0xffffffff should be >> > returned. >> > >> > As file sizes cannot be negative there is not reason to assume that the >> > value in r0 has to be interpreted by semihosting clients as signed. >> > >> > Please, change your commit to check against 0xffffffff. >> > >> > It might make sense to contact ARM to make their specification clearer. >> >> stat(2) will return -1/EOVERFLOW on 32-bit hosts for files over 2 GiB. I >> believe we should be consistent. > > I can see both sides of this one -- the semihosting spec > is pretty old and was designed (to the extent it was designed) > back in an era when 2GB of RAM or a 2GB file was pretty > implausible sounding. (And today there's not much appetite > for making updates to it for AArch32, because 32-bit is > the past, not the future, and in any case you have to deal > with all the existing implementations of the spec so > changes are super painful to try to promulgate.) > > Our QEMU implementation of SYS_ISERROR() says "anything that's > a negative 32-bit integer is an error number" for 32-bit hosts, > which I suppose you might count on the side of "check > against INT32_MAX". > > I think overall that if we think that anybody is or might be > using semihosting with files in that 2..4GB size, we should > err on the side of preserving that functionality. Otherwise > somebody will report it as a bug and we'll want to fix it > as a regression. And it doesn't seem impossible that somebody > out there is doing so. > > If we report 2..4GB file sizes as if the file size value > was an unsigned integer, then clients that expect that will > work, and clients that treat any negative 32-bit value as > an error will also work in the sense that they'll handle it > as an error in the same way they would have done for -1. > So that is the safest approach from a back-compat point > of view, and I think that's what I lean towards doing.
Actually, some existing targets don't handle "negative" file sizes properly. In particular, newlib generally treats the result as an int or an off_t, which is a long (except on cygwin). Both of those types are 32 bits or smaller on (I)LP32. So if you return a 3 GiB size it will be treated as -1 GiB. This will break lseek with SEEK_END, since newlib uses the signed result of flen to recompute a new absolute offset. IMO the spec is unclear, and this is reflected in the varying semantics of differing implementations. I think returning any negative number other than -1 is just inviting bugs. --Sean
