passing constants as "textual literals" to foreign-lambdas?

2024-03-04 Thread Al

Suppose that there is a C function

int result;
void compute( int x, int y /* several more */) {
  result = /* some complex computations */ ;
}

I could get a hold of "result" using foreign-value, and I could declare 
"compute" as a foreign-lambda. If I wanted to pass the constants x=3, 
y=4 (plus some other variables) to compute(), csc would C_fix() the 
constants then C_unfix() them to pass them to the "foreign lambda".


Is there any way to avoid the C_fix() / C_unfix() spiel and pass 
constant values directly as textual literals to compute() -- more 
exactly, to csc which in turn outputs the compute() calls in the 
generated .c file?


Specifically, I have some offsets into a vector that are computed via 
macros. The offsets vary between different invocations of the macros, 
but are known at compile time. compute(), called from the macros, 
operates on the vector at the given offsets (plus a variable 
base-index). I would like to pass those constant offsets directly, 
without generating useless (and slow) code to fix() / unfix() them.


The one work-around I can think of is to write an er-transformer macro 
to generate strings for foreign-primitive / foreign-lambda* 
specializations of compute() with certain constant args. This doesn't 
seem so nice though.
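
For the record, here's roughly what that would look like (a sketch, 
untested; the name compute-at and the C body are made up for 
illustration): an er-macro splices the constant offsets into the C 
fragment of a foreign-lambda* at expansion time, so only the base index 
goes through the usual argument conversion.

(import (chicken foreign) (chicken syntax))

(define-syntax compute-at
  (er-macro-transformer
   (lambda (form rename compare)
     (let ((x (number->string (cadr form)))    ; constant, known at expansion time
           (y (number->string (caddr form))))  ; constant, known at expansion time
       `((,(rename 'foreign-lambda*) void ((int base))
          ,(string-append "compute(base + " x ", base + " y ");"))
         ,(cadddr form))))))

; (compute-at 3 4 base-index) expands to a foreign-lambda* whose C body
; reads "compute(base + 3, base + 4);" -- no C_fix()/C_unfix() for 3 and 4.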



Thanks,
Al



Re: integer64 foreign type ( manual/Foreign%20type%20specifiers.html )

2024-03-02 Thread Al

[ I replied to Felix only by mistake, so forwarding ]


Are you sure you are looking at the latest git HEAD? Both of
your assumptions are correct, but api.call-cc.org and git master
look correct to me.


Ah, you're right, thank you. It's the 5.3.0 manual that mentions the 
flonum fallback, even though csc 5.3.0 generates C_int64_to_num just 
like git-head.



I got confused by building 4 chicken versions (git-head / 5.3.0, with 
gcc-12 / llvm17) while trying to chase down the performance problems I 
mentioned in the "csi slow" thread. It wasn't any of that; at this point 
it *seems* to come down to building with C_COMPILER_OPTIMIZATION_OPTIONS 
set to -march=native. That flag (maybe any extra flags, I don't know 
yet) makes not only csi but also csc's output twice as slow for my 
program, for all 4 CC / chicken-version combos (on my platform, CPU etc. 
-- I'll write more in the other thread).



-- Al



integer64 foreign type ( manual/Foreign%20type%20specifiers.html )

2024-03-02 Thread Al

The manual (git current version) says:

---

integer64type
unsigned-integer64type
A fixnum or integral flonum, mapping to int64_t or uint64_t. When 
outside of fixnum range the value will overflow into a flonum.


[...] (Numbers between 2^62 and 2^64-1 have gaps.)

---


First of all, I don't understand why it's documented as integerNNtype 
when in foreign prims/values one uses "integerNN", not "integerNNtype", 
both for arguments and return values.



But also, I don't see any "flonum" (float or double, I assume) in the 
translated C code. Instead I see things like "C_int64_to_num", which 
seems to return bignums when out of fixnum range; no flonums, no gaps, 
values print out as integers, and (exact? ...) prints #t even for 
(u)int64-max. Is the documentation out of date?
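
Here's the quick check I did, for reference (a sketch of it, anyway):

(import (chicken base) (chicken foreign))
(foreign-declare "#include <stdint.h>")

(define u64-max (foreign-value "UINT64_MAX" unsigned-integer64))
(print u64-max)           ; 18446744073709551615
(print (exact? u64-max))  ; #t -- a bignum, not a flonum, so no gaps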



Thanks, Al




Re: "cannot coerce inexact literal to fixnum"

2024-02-24 Thread Al

On 2024-02-10 16:39, Peter Bex wrote:


Again, these could still be bignums when run on a 32-bit platform
(that's why the return type of s32vector-ref is defined as "integer" and
not "fixnum")


Hm.. does this mean that using s32vector-ref and -set! always incurs 
typecheck costs, no matter how I type-declare the identifiers passed to 
/ from them? And, furthermore, that if I replace these with 
custom-written (foreign-primitive ...)'s, having integer32 args / 
returns, those would also incur typechecks, and thus never be as 
efficient as C...?



I've generally taken to avoiding the s32vector-ref/set! builtins. 
Instead I mostly keep data in SRFI-4 vectors and manipulate it with 
C-backed foreign-primitives, without ever extracting to fixnums. For 
example I have things like (s32vector-move1! vec src-idx dst-idx), which 
copies an element from src-idx to dst-idx. But even for these, I wonder 
what type the indexes should be declared as, to minimize typechecks at 
the interface between C and Chicken...? Perhaps even how to make them 
inline-able to simple C array ops?
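
For concreteness, s32vector-move1! is essentially this (a sketch; the 
int declarations for the indexes are exactly the part I'm unsure about):

(import (chicken foreign) srfi-4)

(define s32vector-move1!
  (foreign-lambda* void ((s32vector vec) (int src) (int dst))
    "vec[dst] = vec[src];"))

; (s32vector-move1! v 0 2) copies element 0 of v into slot 2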



Best, Al




csi from git version: slow?

2024-02-24 Thread Al
Hello, I've tried the git version (2024-02-23, compiled with LLVM-17 
clang, on a Debian 12 / linux 6.6.x). While csc and the executables it 
generates are a bit faster, running my program under csi is now twice as 
slow compared to the latest release (5.3.0).



Now, before I engage in other investigations, like compiling with GCC or 
checking the nature of my program (it sets a large number of globals by 
running some DSL-type macros hundreds of times, which would be its most 
obvious distinguishing characteristic), I need to ask this first: is 
this to be expected? Are there some "development version" / debug flags 
/ expensive interpreter features that differ compared to stable? I 
customized only config.make, changing the C_COMPILER and PREFIX.



Thanks, Al




Re: "cannot coerce inexact literal to fixnum"

2024-02-10 Thread Al

On 2024-02-10 15:38, Peter Bex wrote:


That's because you're using fixnum mode.  As I explained, using literals
that might be too large for fixnums break the fixnum mode's premise that
everything must be a fixnum.


Oh. So it refuses to emit C code that might break on 32-bit at runtime 
(silent truncation of atoi result, presumably), preferring instead to 
definitely break during Scheme compilation on any platform. OK, I get 
the rationale.




That's because string->number gets constant-folded and evaluated at
compile-time.



Obviously; and I suppose there's no simple way to prevent that for just 
one line, rather than for the entire unit?



It would help if you tried to explain exactly _what_ you're trying to do
here, instead of _how_ you're trying to do it.  Why do you need these
constants?


I did mention twice that I'm using them to implement int->u32. There are 
also places where I need to increment-and-wrap int32's by INT32_MIN. 
Incidentally, I'm writing a Forth compiler (which may or may not have 
been a good idea). I store values in s32vector's, but they get turned 
into Scheme numbers by s32vector-ref. I guess I'd prefer native 
s32int/u32int types, complete with wrap-around and meaningful conversion 
to Scheme ints, but I don't think that exists.



It does that (more or less), as I explained. And it *wouldn't* work,



Yeah, I understand why now. I suppose the best way is to use foreign 
values. Maybe I should switch the arithmetic code to C too. Thanks.
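
For example, the int->u32 I keep mentioning could move to the C side 
like this (a sketch, untested):

(import (chicken foreign))
(foreign-declare "#include <stdint.h>")

(define int->u32
  (foreign-lambda* unsigned-integer32 ((integer64 n))
    "C_return((uint32_t)n);"))

; (int->u32 -1) => 4294967295, with no big literals in the Scheme source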



-- Al


Re: (declare (pure ...))

2024-02-10 Thread Al

On 2024-02-10 14:28, Peter Bex wrote:


CHICKEN's type system differentiates between "pure" and "clean". A "pure"
[...]
Hope this clears things up a bit!



Ah, that's what I was looking for. So I shouldn't declare procedures 
using vector-ref as pure, but as clean. Now, how do I declare a Scheme 
procedure as "clean"? I don't see that in the manual. I'll check out the 
source, though grep'ing for vector-ref is obviously not an option.
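
In the meantime, if I read the Types chapter right, the --> syntax below 
is how one marks a procedure as side-effect free (a sketch; whether that 
means "clean" or "pure" is exactly my question):

(import (chicken type))

(: my-ref ((vector-of fixnum) fixnum --> fixnum))
(define (my-ref v i) (vector-ref v i))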



-- Al




Re: "cannot coerce inexact literal to fixnum"

2024-02-10 Thread Al

On 2024-02-10 13:00, Peter Bex wrote:


These so-called "big-fixnums" are compiled into a string literal which gets 
decoded on-the-fly at runtime into either a fixnum (on 64-bit) or a bignum (on 32-bit).



That would be fine, but where does that happen? csc actually barfs on my 
Scheme code (as per the subject line), instead of emitting C code to 
encode/decode via a string at runtime, as you mention. It won't even let 
me use string->number by hand. The only thing that worked was



(cond-expand
  (csi
    (define INT32_MAX  #x7fffffff)
    (define INT32_MIN  #x-80000000)
    (define UINT32_MAX #xffffffff)
    )
  (else
    ; chicken csc only does 31-bit literals in fixnum mode
    (define INT32_MIN  (foreign-value "((int32_t)  0x80000000)" integer32))
    (define INT32_MAX  (foreign-value "((int32_t)  0x7fffffff)" integer32))
    (define UINT32_MAX (foreign-value "((uint32_t) 0xffffffff)" unsigned-integer32))
    )
  )


... and I'm not sure what the implications of using a "foreign value" 
further down in my program are. If I assign one to another variable, 
does that variable also become a "foreign value"? How about if I do 
(bitwise-and UINT32_MAX int) to truncate a signed number to unsigned32 
(which is what I'm actually using them for)?
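
My working assumption (sketched below) is that the foreign-value is 
converted to an ordinary Scheme number once, at that point, so the 
variable behaves like any other integer afterwards -- but I'd like 
confirmation:

(import (chicken base) (chicken foreign) (chicken bitwise))

(define UINT32_MAX (foreign-value "0xffffffffUL" unsigned-integer32))

; if the assumption holds, this is plain Scheme arithmetic:
(print (bitwise-and UINT32_MAX -3))  ; 4294967293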




There's (currently) no option to force fixnum mode in a way that ignores
the existence of 32-bit platforms.  Theoretically, it should be possible to
compile your code assuming fixnums (so it emits C integer literals) and
make it barf at compilation time if one tried to build for a 32-bit
platform using a #ifdef or something.  We just don't have the required
code to do this, and I'm not sure this is something we'd all want.



Well, if csc emitted string->number code in fixnum mode when necessary, 
that would at least work. Although if I'm using fixnum mode, I'm 
probably looking for performance, and I'm not sure the subsequent C 
compiler is smart enough to optimize the "atoi" (or whatever) away into 
a constant. Maybe it is nowadays.



Otherwise, how do I write Scheme code to truncate a signed number to 
unsigned32? Resort to foreign values as I did above (or write foreign 
functions)?



-- Al




Re: (declare (pure ...))

2024-02-10 Thread Al

On 2024-02-10 11:53, Pietro Cerutti wrote:

I don't see why vector-ref would be any less pure than, say, a let 
binding. Or do you mean vector-set! ?


vector-ref, applied to a global, could return different values even when 
called with the same arguments: between calls, some other code could 
modify the contents of the vector. So, according to referential 
transparency, vector-ref cannot be pure. Likewise, a function that calls 
vector-ref cannot be pure.
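
A minimal illustration:

(define v (vector 1))
(define (peek) (vector-ref v 0))

(print (peek))       ; 1
(vector-set! v 0 2)  ; some other, impure code mutates the global
(print (peek))       ; 2 -- same arguments, different result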



Further, according to referential transparency, no procedure that 
*reads* a global (not just vectors -- even immediate values like numbers 
and booleans) can be pure. It might return different values if impure 
code modifies the global between calls to the procedure in question.



In contrast, a let binding can only be modified by the code in the 
current scope.



There should still be a way to communicate to the optimizer that 
vector-ref, or some procedure that uses vector-ref on a global 
identifier (but does not call set! / vector-set! on globals) does not, 
uh, modify any globals.



-- Al




Re: (declare (pure ...))

2024-02-10 Thread Al

On 2024-02-10 11:20, Pietro Cerutti wrote:

Both Haskell and CHICKEN ultimately compile to object code. That is 
not important: the important thing is the abstract machine you're 
programming against. This is why I specified "observable" in my 
previous reply.


I agree. And if we step out of the monadic framework for just a bit, 
you'll see that there's room for an abstract machine in which purity 
just means "does not set! globals". This would still let the optimizer 
know that globals do not need to be re-checked after such a procedure is 
invoked, for example. I'm not sure if csc really uses the full 
"referentially transparent" definition (i.e. memoize-able), or just the 
"does not modify globals" one.



On a practical level, I would be sad if vector-ref, for example, were 
"impure", and thus compiling a vector-ref invalidated all 
previously-checked globals for the current scope. Likewise, I would 
prefer to declare a procedure using vector-ref as pure, to let csc know 
that it does not modify globals (or the filesystem etc).





Yeah their wording is different but the meaning is the same, i.e., a 
--> function will be marked as pure (for the implementation, see the 
last value returned by validate-type in scrutinizer.scm).


I think that part just decides whether the procedure is "pure", for 
whatever purity means. That doesn't say anything about what compiling a 
call to such a "pure" procedure means (can memoize the value or not, 
need to re-check globals or closures afterwards or not etc).



-- Al




Re: (declare (pure ...))

2024-02-10 Thread Al

On 2024-02-10 10:13, Pietro Cerutti wrote:


I don't get your question: those two things are the same thing :)

Referential transparency means you can substitute an expression with 
its expansion down to a value. If anything happening in between causes 
(observable *) changes, you can't do it anymore.


(*) modifying the program counter is not an observable change, for 
example.


Those two things are the same thing in Haskell and languages that have a 
mathematical model of the program, yes. Scheme is... not that, much of 
the time. Chicken is implemented on top of C, so it's even less clear.



But let's not get theoretical. Yes, my question is whether csc can 
(does?) "memoize" the results of "pure" functions.



In any case, the definitions I quoted above

* https://wiki.call-cc.org/man/5/Declarations#pure

* https://wiki.call-cc.org/man/5/Types#purity

are not the same thing, so either the one or the other should stand. 
Though I'm not sure what "local variables or data contained in local 
variables" actually means. I think it means you can't "set!" globals.



-- Al




Re: (declare (pure ...))

2024-02-10 Thread Al




* only side-effect free, or ...

* also return the same value when called with the same arguments?


The first implies the second: to be able to choose from a set of 
return values for the same given argument, you do need to have 
side-effects, e.g., interact with a RNG which maintains state, read 
from a file, maintain an index into a circular vector of results, etc..


Here's what the docs say:
http://wiki.call-cc.org/man/5/Declarations#pure

"referentially transparent, that is, as not having any side effects". 
You can read "the environment" (including all globals) without actually 
modifying the environment. Between two applications of the procedure, 
other impure procedures may modify the environment. So it's not clear if 
it means Haskell monad purity or immutable purity, hence my question.



Furthermore, https://wiki.call-cc.org/man/5/Types#purity says: "Using 
the (... --> ...) syntax, you can declare a procedure to not modify 
local state, i.e. not causing any side-effects on local variables or 
data contained in local variables".



-- Al




Re: "cannot coerce inexact literal to fixnum"

2024-02-09 Thread Al

On 2024-02-10 02:42, Al wrote:

... if I enable fixnum, csc chokes on both the third and fourth 
display's with: "Error: cannot coerce inexact literal `2147483647' to 
fixnum". It compiles and runs fine if those lines are commented out 
(or if fixnum is disabled).


So the error comes from a check for (big-fixnum?) in 
chicken-core/core.scm. It is defined in chicken-core/support.scm:


(define (big-fixnum? x) ;; XXX: This should probably be in c-platform
  (and (fixnum? x)
       (feature? #:64bit)
       (or (fx> x 1073741823)
           (fx< x -1073741824) ) ) )

(define (small-bignum? x) ;; XXX: This should probably be in c-platform
  (and (bignum? x)
       (not (feature? #:64bit))
       (fx<= (integer-length x) 62) ) )

Maybe the condition in big-fixnum should be negated? Apply the 
restrictive #x3fffffff limit only when NOT (feature? #:64bit) ?
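
That is, something like this (untested sketch of the change, in 
support.scm's terms):

(define (big-fixnum? x)
  (and (fixnum? x)
       (not (feature? #:64bit))   ; only flag big literals on 32-bit
       (or (fx> x 1073741823)
           (fx< x -1073741824) ) ) )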



Also, I've looked at Ken's number-limits egg. It has


#define MOST_POSITIVE_INT32 ((int32_t) 0x3fffffffL)


I don't know why this (as opposed to 0x7fffffff) would apply to #:64bit 
installs..? In any case the documentation on call-cc.org says


* most-negative-integer32
Smallest negative int32_t value
* most-positive-integer32
Largest negative (sic!) int32_t value


... which is ALSO wrong for most-positive-integer32, but it does refer 
to int32_t, as opposed to chicken's internal representation (which uses 
up an extra bit).





"cannot coerce inexact literal to fixnum"

2024-02-09 Thread Al

(import (chicken fixnum))
; (declare (fixnum))
(define mp most-positive-fixnum)

(display mp) (newline)
(display (string->number (number->string mp))) (newline)

(display #x7fffffff) (newline)
(display (string->number "#x7fffffff")) (newline)


Obviously the first number is much greater than INT32_MAX. However, if I 
enable fixnum, csc chokes on both the third and fourth display's with: 
"Error: cannot coerce inexact literal `2147483647' to fixnum". It 
compiles and runs fine if those lines are commented out (or if fixnum is 
disabled).



Not sure what's wrong here. Tried with both 5.3.0 and 5.3.1-pre.


-- Al




(declare (pure ...))

2024-02-09 Thread Al

Hi,


what does (declare (pure ..)) mean to csc? Is the function supposed to be

* only side-effect free, or ...

* also return the same value when called with the same arguments?


Thanks,

Al




compiler types: records / collections interaction

2024-02-09 Thread Al
Suppose I have a record type myrec, and collections (vectors, lists and 
hash-tables) whose values are myrec's.



What combination of

* define-record-type (srfi-9, srfi-99, chicken define-record, whatever) and

* (declare type ...)

can I use to inform the compiler that (for collections of myrec) 
vector-ref returns myrec's (likewise list car/cdr, hash-table-ref)? And 
that it need not emit any instance-type-checking code for objects 
extracted from such collections?



Furthermore, how can I see what the compiler thinks of a given 
identifier? I've used 'csc -ot' but it only emits types for some 
identifiers. Maybe it inlines others, I don't really know.



Separately, how can I tell the compiler that fields in these records 
have certain types? Add type declarations for the implicitly-defined 
per-field accessors and mutators?
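
For concreteness, this is the kind of declaration I mean (a sketch, 
assuming the (struct ...) type syntax applies to records; all names 
made up):

(import (chicken base) (chicken type))

(define-record-type myrec
  (make-myrec n)
  myrec?
  (n myrec-n myrec-n-set!))

(: myrec-n ((struct myrec) -> fixnum))
(: lookup ((vector-of (struct myrec)) fixnum -> (struct myrec)))
(define (lookup vec i) (vector-ref vec i))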



I've tried unwrapping collections of myrec, and also myrec fields, and 
it seems to make a huge difference in the speed of compiled code. 
Presumably because I don't know how to tell the compiler to omit type 
checking in "safe" cases. I know I could use some of the more aggressive 
optimization levels, but I don't really want to compile unsafe code 
everywhere, just where I'm using the correct types.



Thanks,

Al




chicken (un)install, deployment

2024-02-03 Thread Al

 Hi,


I'd like to distribute a project that uses chicken and a number of eggs. 
In the Makefile, I'm trying to add a target that ensures the user has 
those eggs installed. However,


* chicken-install seems to proceed unconditionally (even if the egg is 
already installed).


* I see no chicken-uninstall, or at least a chicken-remove-all-eggs, so 
that I can at least test things in a clean-room installation, without 
rebuilding.



I came up with a sed hack to extract all imports, transform those into 
arguments for "chicken-status -list", and generate a chicken-status.scm 
file (which can then be processed by "chicken-install -from-list"). But 
it still installs unconditionally. I could add an extra step to check if 
"csi import egg... 


What to do? Maybe I should convert my project to egg format?


-- Al




compile-time deserialization

2023-12-26 Thread Al
Hi, suppose I need to reference data from "file.txt" in my Scheme 
program. I could use read-char / read / open-input-file etc. (or other 
(chicken io) procedures) to read the contents of the file into a 
variable.


However, such deserialization happens at run-time. If I compile my 
program into an executable, it will try to open "file.txt" at run-time. 
How can I arrange for "file.txt" to be read at compile-time, and inlined 
into the executable, so that the executable doesn't require access to 
the original file?


The only way I can think of is to convert "file.txt" to a scheme string 
definition in a separate file:


; file.scm
(define file-contents " ... ")

and include it via (include "file.scm"). Then the definition would occur 
at compile-time.


But of course this requires encoding (possibly binary) files as scheme 
strings, and probably an extra Makefile step to convert file.txt into 
file.scm. This is not attractive -- are there other options?
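
The closest thing I've found so far is to read the file from a macro at 
expansion time, so no separate file.scm or Makefile step is needed 
(a sketch, untested; I don't know how well this holds up for binary 
data):

(import (chicken syntax))
(import-for-syntax (chicken io))

(define-syntax include-as-string
  (er-macro-transformer
   (lambda (form rename compare)
     ; runs at expansion time; the expansion is a string literal
     (with-input-from-file (cadr form)
       (lambda () (read-string #f))))))

(define file-contents (include-as-string "file.txt"))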