Re: compile-time deserialization

2023-12-26 Thread Marc Feeley


> On Dec 26, 2023, at 3:07 PM, Al  wrote:
> 
> Hi, suppose I need to reference data from "file.txt" in my Scheme program. I 
> could use read-char / read / open-input-file etc (or other (chicken io) 
> procedures) to read the contents of the file into a variable.
> 
> However, such deserialization happens at run-time. If I compile my program 
> into an executable, it will try to open "file.txt" at run-time. How can I 
> arrange for "file.txt" to be read at compile-time, and inlined into the 
> executable, so that the executable doesn't require access to the original 
> file?
> 
> The only way I can think of is to convert "file.txt" to a scheme string 
> definition in a separate file:
> 
> ; file.scm
> (define file-contents " ... ")
> 
> and include it via (include "file.scm"). Then the definition would occur at 
> compile-time.
> 
> But of course this requires encoding (possibly binary) files as scheme 
> strings, and probably an extra Makefile step to convert file.txt into 
> file.scm. This is not attractive -- are there other options?
> 
> 

Hello. This can be achieved with macros. The body of a macro definition is 
executed at compile time, so it is as simple as reading the content of the 
file at that moment and transforming it into a constant:

$ cat embed-file.scm
;;; file: embed-file.scm

(import (chicken syntax))

(define-syntax embed-file-as-constant
  (er-macro-transformer
    (lambda (x r c)
      (import (chicken io))
      ;; runs at compile time: the expansion is the file's contents as a string
      (with-input-from-file (cadr x) (lambda () (read-string #f))))))

;;; sample use of embed-file-as-constant:

(define file-contents (embed-file-as-constant "file.txt"))

(display "file contents:\n")
(display file-contents)
$ cat file.txt
first line
second line
last line
$ csc embed-file.scm
$ ./embed-file
file contents:
first line
second line
last line

Marc





Re: compile-time deserialization

2023-12-26 Thread Kristian Lein-Mathisen
Hi,

You could try to put everything you're doing with read-string / read /
open-input-file inside syntax. For example:

~/tmp> cat ./compile-time-io.scm
(define file-contents
  (let-syntax
      ((load
        (er-macro-transformer
         (lambda (x r t)
           (import chicken.io chicken.port)
           ;; the file is read during expansion and embedded as a quoted list
           `(quote
             ,(with-input-from-file
                  "/proc/cpuinfo"
                read-lines))))))
    (load)))

(write file-contents)
(newline)

~/tmp> csc compile-time-io.scm && ./compile-time-io
( "Processor\t: AArch64 Processor rev 14 (aarch64)"
  "processor\t: 0"
  "BogoMIPS\t: 38.40" ... )

~/tmp> grep BogoMIPS ./compile-time-io
grep: ./compile-time-io: binary file matches
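
The same trick extends from read-lines to read, which covers the
"deserialization" case where the file holds a Scheme datum rather than raw
text. A minimal sketch, assuming CHICKEN 5; the macro name embed-datum and
the file table.scm are hypothetical and do not come from this thread:

```scheme
;;; embed-datum.scm (hypothetical file name)

(import (chicken syntax))

;; (embed-datum "path") expands to (quote <first datum read from path>).
;; The file is opened while the compiler expands the macro, not at run time.
(define-syntax embed-datum
  (er-macro-transformer
    (lambda (x r c)
      `(,(r 'quote) ,(with-input-from-file (cadr x) read)))))

;; Assumes table.scm exists at compile time and holds a single datum,
;; for example ((a . 1) (b . 2)).
(define table (embed-datum "table.scm"))

(write table)
(newline)
```

Because the datum ends up quoted in the expansion, it should be treated as a
constant; code that needs to mutate it would have to copy it first.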


K.



Re: compile-time deserialization

2023-12-26 Thread siiky via
Oh I forgot to mention you need an extra compiler flag to use it 
(-extend). And I also just remembered that I wrote a post about this at 
the time. (:


https://siiky.srht.site/scheme/reader-syntax.html





Re: compile-time deserialization

2023-12-26 Thread siiky via

Hi,

You can use reader syntax[0]. Take a look at this SQL reader syntax[1,2] 
I wrote some time ago for inspiration (it should require very little 
changing).
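
A minimal sketch of what such a reader extension could look like for the
file-embedding case, assuming CHICKEN 5 and the (chicken read-syntax) module.
The #@ dispatch character and the file name embed-reader.scm are arbitrary
choices made for illustration; they are not taken from the linked SQL example:

```scheme
;;; embed-reader.scm (hypothetical file name)
;;; Meant to be loaded into the compiler, not compiled into the program.

(import (chicken read-syntax) (chicken io))

;; Hypothetical syntax: #@"file.txt" is replaced, at read time, by a string
;; literal holding the contents of file.txt.
(set-sharp-read-syntax! #\@
  (lambda (port)
    (let ((path (read port)))            ; the string that follows #@
      (with-input-from-file path
        (lambda () (read-string #f))))))
```

With that in place, a program (say main.scm, also a made-up name) could
contain (define file-contents #@"file.txt") and be compiled with
csc -extend embed-reader.scm main.scm, so that the reader extension is
loaded before compilation starts.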


[0] https://api.call-cc.org/5/doc/chicken/read-syntax/set-read-syntax%21
[1] https://git.sr.ht/~siiky/save-for-later/tree/c7297083127ac1543bf6013e27259b35d07236ee/item/sql-reader-syntax/sql-reader-syntax.scm
[2] https://git.sr.ht/~siiky/save-for-later/tree/c7297083127ac1543bf6013e27259b35d07236ee/item/sql-reader-syntax/example.scm


siiky





Re: cannot open file if name contains accented characters

2023-12-26 Thread felix . winkelmann
> Thanks for the responses. I tried what I could, but it still doesn't work.
> I wrote some code to test if I can open and close a file. The C code works
> but the Chicken code doesn't.
>
> (import (chicken foreign))
>
> (foreign-declare "#include ")
> (foreign-declare "#include ")
>
> (define chicken_wfopen
> (foreign-lambda (c-pointer "FILE") "_wfopen" (c-pointer "wchar_t")
> (c-pointer "wchar_t")))
> (define chicken_fclose
> (foreign-lambda int "fclose" (c-pointer "FILE")))
>
> (let ([file-handle (chicken_wfopen "c:\\temp\\íűőúöüóéá.txt" "r")])
> (print "File has been opened at: " (number->string file-handle))
> (print "Closing file.")
> (chicken_fclose file-handle))
>
> The Chicken code fails with the error message: Error: unbound variable:
> It doesn't say the name of the variable, and it returns the error code 70.
> Note that in C the string needs to be prefixed with an L to make it wide
> character. In Chicken I don't know how to do that.
> Does anybody have a clue which variable is unbound?

Ugh. Sorry, I don't know what variable is meant here. I also gave wrong
information: _wfopen returns a FILE* (as you correctly implement above).
The returned pointer will not be a number, though.

Anyway: the strings you pass are Scheme strings, passed as char * to the
FFI code. This is wrong, as you want UTF-16 Windows strings (wchar_t *)
here. The proper way would be (I think) to do

;; warning: totally untested

;; assumes <windows.h> has been included via foreign-declare
;; (needed for MultiByteToWideChar and CP_UTF8)

(define wfopen
  (foreign-lambda* bool ((c-string str) (c-string mode) (scheme-object port))
   "int sz1 = 4 * (strlen(str) + 1), sz2 = 4 * (strlen(mode) + 1);
    wchar_t *buf1 = malloc(sz1), *buf2 = malloc(sz2);
    FILE *fp;
    /* the last argument is the buffer size in wide characters, not bytes */
    MultiByteToWideChar(CP_UTF8, 0, str, -1, buf1, sz1 / sizeof(wchar_t));
    MultiByteToWideChar(CP_UTF8, 0, mode, -1, buf2, sz2 / sizeof(wchar_t));
    fp = _wfopen(buf1, buf2);
    free(buf1); free(buf2);
    if(fp == NULL) return(0);
    C_set_block_item(port, 0, (C_word)fp);
    return(1);"))

(define (fopen-utf16 fname mode input?)
  (let ((p (##sys#make-port (if input? 1 2) ##sys#stream-port-class fname 'stream)))
    ;; wfopen stores the FILE* in slot 0 of the port on success
    (if (wfopen fname mode p)
        p
        (error "error opening file" fname mode))))

The ##sys#make-port call above creates a proper port to be used in Scheme;
the input? flag is unfortunately needed to distinguish between input and
output ports. Sorry if all this is a bit confusing.


felix





compile-time deserialization

2023-12-26 Thread Al
Hi, suppose I need to reference data from "file.txt" in my Scheme 
program. I could use read-char / read / open-input-file etc (or other 
(chicken io) procedures) to read the contents of the file into a 
variable.


However, such deserialization happens at run-time. If I compile my 
program into an executable, it will try to open "file.txt" at run-time. 
How can I arrange for "file.txt" to be read at compile-time, and inlined 
into the executable, so that the executable doesn't require access to 
the original file?


The only way I can think of is to convert "file.txt" to a scheme string 
definition in a separate file:


; file.scm
(define file-contents " ... ")

and include it via (include "file.scm"). Then the definition would occur 
at compile-time.


But of course this requires encoding (possibly binary) files as scheme 
strings, and probably an extra Makefile step to convert file.txt into 
file.scm. This is not attractive -- are there other options?




Re: cannot open file if name contains accented characters

2023-12-26 Thread Mátyás Seress
Thanks for the responses. I tried what I could, but it still doesn't work.
I wrote some code to test if I can open and close a file. The C code works
but the Chicken code doesn't.

-- test.c --

#include 
#include 

// this file needs to be saved with explicit BOM (byte order mark),
// otherwise it won't work
int main()
{
    FILE* fileHandle = _wfopen(L"c:\\temp\\íűőúöüóéá.txt", L"r");
    printf("File has been opened at: %p\n", fileHandle);
    printf("Closing file\n");
    fclose(fileHandle);
    return 0;
}

-- test.scm --

(import (chicken foreign))

(foreign-declare "#include ")
(foreign-declare "#include ")

(define chicken_wfopen
  (foreign-lambda (c-pointer "FILE") "_wfopen" (c-pointer "wchar_t")
                  (c-pointer "wchar_t")))
(define chicken_fclose
  (foreign-lambda int "fclose" (c-pointer "FILE")))

(let ([file-handle (chicken_wfopen "c:\\temp\\íűőúöüóéá.txt" "r")])
  (print "File has been opened at: " (number->string file-handle))
  (print "Closing file.")
  (chicken_fclose file-handle))

The Chicken code fails with the error message: Error: unbound variable:
It doesn't say the name of the variable, and it returns the error code 70.
Note that in C the string needs to be prefixed with an L to make it wide
character. In Chicken I don't know how to do that.
Does anybody have a clue which variable is unbound?

On Mon, 25 Dec 2023 at 20:19, John Cowan wrote:

>
>
> On Mon, Dec 25, 2023 at 6:07 AM  wrote:
>
>
>> I'm not too familiar with the way Windows handles non-ASCII characters
>> in operating system calls, but I assume that what gets passed to the C
>> library runtime functions like fopen(3), etc. assumes a particular
>> encoding.
>>
>
> Basically, there are two modes, one that assumes a particular encoding, as
> you say (that's the default) and one that assumes wchar_t, which is always
> UTF-16LE.  Which encoding is used in the first mode depends on the locale
> setting.
>
>> From a quick glance at the Windows docs[1] it seems one needs to use
>> "_fwopen" with a wchar_t string argument to pass extended characters.
>>
>
> Indeed, except that it's _wfopen, not _fwopen. Note that _fopen can
> involve 8-bit, 16-bit, or 8/16-bit mode depending on the encoding.
>
>> Sorry, if this is not overly helpful. We are currently in the process of
>> improving the unicode support for the next major version of CHICKEN.
>>
>
> This makes me realize that posixwin needs to be changed in C6 so that it
> always uses the second mode.  A simple way to do this is to use a UTF-8 to
> UTF-16LE converter (and vice versa for things like dirread) right before
> calling _wfopen.
>
>
>>
>> felix
>>
>>
>>