Re: German Umlauts / UTF8 with comparse
Ah, right. Thanks! If I remember correctly, that was also discussed in the older mail thread about parsing Japanese, when Moritz said that he didn't want to make comparse users dependent on utf8.

Works well now, and also thanks for mentioning the ,d trick!

On Tue, Feb 18, 2020 at 12:44 PM wrote:
> Christoph Lange wrote:
> > Yes, this helps. Kind of ;-) ... using the character set
> > char-set:alphabetic, my umlauts are now parsed. But I don't get them back
> > in my result, at least not as printable characters. Instead, the following
> > happens, and utterly confuses me:
>
> Hmm, indeed. From what I can see, the result of parse is not encoded in
> UTF-8.
>
> I went to see comparse’s code and found that the (as-string) combiner
> uses (->string) internally. But since comparse doesn’t use the utf8 egg,
> it uses the core version of (->string), which happens to encode #\ä in
> latin-1!
>
> The only workaround I can think of right now is to move the conversion
> back to a string out of the comparse egg and into your own, utf8 aware,
> code.
>
> This would look something like this:
>
>   (import comparse utf8 utf8-srfi-14 unicode-char-sets)
>
>   (define s "Gänsesäger 2,1")
>   (define s1 "Rotkehlchen 1,0")
>
>   (define (utf8-in cs)
>     (satisfies (lambda (c) (char-set-contains? cs c))))
>
>   (define letter
>     (utf8-in char-set:alphabetic))
>
>   (define letters
>     (repeated letter 1 20))
>
>   (define (parse-as-string parser input)
>     (list->string (parse parser input)))
>
>   (define p1 (parse-as-string letters (string->list s1)))
>   (define p (parse-as-string letters (string->list s)))
>
> PS: a trick I used to check the encoding of the strings was using the ,d
> csi command, which prints the contents of the string byte by byte. There
> it’s easy to see if non ascii characters indeed take more than one byte
> as they should in UTF-8.

--
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö
Re: German Umlauts / UTF8 with comparse
Yes, this helps. Kind of ;-) ... using the character set char-set:alphabetic, my umlauts are now parsed. But I don't get them back in my result, at least not as printable characters. Instead, the following happens, and utterly confuses me:

    #;2> (define s3 (parse letters (string->list s)))
    #;3> s3
    "Gnsesger"
    #;4> (string-length s3)
    6
    #;5> (string->list s3)
    (#\G #\x4bb3 #\e #\s #\x49e5 #\r)
    #;6> (list->string (string->list s3))
    "G䮳es䧥r"

So, I put the parse result into 's3'. Printing it, I see an eight-character string, namely the one I want, minus my beloved umlauts. 'string-length' reports that string to be six characters long, and 'string->list' gives me exactly that, swallowing further ASCII characters of my string. Reversing that with 'list->string' even yields Chinese characters, although '(list->string (string->list s1))', with my pure ASCII string, round-trips without fault.

I guess I have some problems understanding some utf8 concepts?!

/Christoph

On Mon, Feb 17, 2020 at 3:38 PM wrote:
> Christoph Lange wrote:
> > meaning, that the ä isn't recognized as being a letter within the
> > 'char-set:letter'.
>
> The utf8 egg’s srfi-14 character sets are designed to be compatible with
> the original srfi-14 and only contain ASCII characters, as stated in the
> documentation:
> https://wiki.call-cc.org/eggref/5/utf8#unicode-char-sets
> “The default SRFI-14 char-sets are defined using ASCII-only characters”
>
> You might want to import the unicode-char-sets module, and use one of its
> sets, like char-set:alphabetic.
>
> I hope this helps. :)

--
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö
German Umlauts / UTF8 with comparse
I read older threads about parsing Japanese with comparse and took some ideas from there, but am still stuck:

    (import comparse utf8 utf8-srfi-14)

    (define s "Gänsesäger 2,1")
    (define s1 "Rotkehlchen 1,0")

    (define (utf8-in cs)
      (satisfies (lambda (c) (char-set-contains? cs c))))

    (define letter
      (utf8-in char-set:letter))

    (define letters
      (as-string (repeated letter 1 20)))

This is what I have, and the 'word' at the beginning of s1 is parsed completely and correctly by the 'letters' parser:

    #;1> (parse letters (string->list s1))
    "Rotkehlchen"
    # ; 2 values

For 's', though, I get this:

    #;2> (parse letters (string->list s))
    "G"
    # ; 2 values

meaning that the ä isn't recognized as being a letter within 'char-set:letter'.

(The UTF8 aspect of correct character width works, on the other hand: in the remaining string, the ä is represented by only one #\. If I didn't use the UTF8 string equivalents by importing 'utf8', it would be two.)

Any hint for me?

/Christoph

--
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö
[Chicken-users] FFI and callbacks -- in Scheme?!
NULL
> (define mqttc (mosquitto-new #f #t #f))
> (display "mqtt client address\n")
> (display mqttc) (newline)
> (mosquitto-connect-callback-set mqttc #$on_connect)
> (mosquitto-message-callback-set mqttc #$on_message)
> (display "connect to broker\n")
> (display (mosquitto-connect mqttc "localhost" 1883 60))
> (newline)
> (mosquitto-subscribe mqttc #f "greetings/#" 0)
> (define payload (string->blob "gluck, gluck!"))
> (mosquitto-publish mqttc "chicken/call" payload 0 #f)
> (let loop ()
>   (mosquitto-loop mqttc -1 1)
>   (loop))
> ;;(mosquitto-loop-forever mqttc -1 1)
> ;; (mosquitto-loop mqttc -1 1)
> (mosquitto-destroy mqttc)
> (mosquitto-lib-cleanup)

--
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö

___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users
Re: [Chicken-users] more 'foreign' questions
Thanks! This all works now, and I learned a lot.

> > I now get a void pointer to some data in memory, and the length of the
> > payload. How can I make that e.g. a blob? Or a string?
>
> we’re trying to stop using string for non-char data so blob please ;-)

Sorry. Didn't want to frighten you ;-) I almost always have strings in there (JSON), that's why I asked. But when I put everything together (I hope it will end up as a new egg), I'll surely return the blob, so that people can decide freely.
Re: [Chicken-users] more 'foreign' questions
Awesome, thanks! Will try. Have a good flight!

On Wed, Mar 27, 2019 at 8:19 PM Kon Lovett wrote:
> (gotta flight coming up so dashing but hth)
>
> probably need to use the length for make-blob & then move-memory! from
> pointer to the alloc’ed blob
>
> see (chicken memory)
>
> > On Mar 27, 2019, at 12:14 PM, Christoph Lange wrote:
> >
> > > i haven’t used the bind egg but the documentation "General Operation”
> > > section beginning with "Structure and union definitions …” seems relevant.
> >
> > Haha, yes, thanks. Finding the relevant parts of the docs seems to be
> > the challenge in the beginning. Will read that.
> >
> > > it rolls access routines, ex: mosquitto_message-mid,
> > > mosquitto_message-payload, ...
> > >
> > > #;1> (import bind)
> > > #;2> ,x* (bind* "struct mosquitto_message{
> > >   int mid;
> > >   char *topic;
> > >   void *payload;
> > >   int payloadlen;
> > >   int qos;
> > >   ___bool retain;
> > > };")
> >
> > Oh, useful tool to learn, as it seems.
> >
> > Thanks for the help. ... one follow-up question:
> >
> > I now get a void pointer to some data in memory, and the length of the
> > payload. How can I make that e.g. a blob? Or a string?

--
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö
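The make-blob plus move-memory! suggestion quoted above could be sketched roughly as follows. This is a minimal illustration, assuming CHICKEN 5 and its (chicken blob) and (chicken memory) modules; `pointer->blob` is a hypothetical helper name, not something provided by an egg, and the malloc'ed buffer only stands in for the payload pointer a real callback would receive.

```scheme
(import (chicken blob) (chicken memory))

;; Hypothetical helper (not part of any egg): copy LEN bytes starting at the
;; foreign pointer PTR into a freshly allocated blob.
(define (pointer->blob ptr len)
  (let ((b (make-blob len)))
    ;; A raw pointer carries no size information, so the byte count is required.
    (move-memory! ptr b len)
    b))

;; Stand-in for a callback payload: three bytes of malloc'ed memory.
(define payload (allocate 3))
(pointer-u8-set! payload 97)               ; #\a
(pointer-u8-set! (pointer+ payload 1) 98)  ; #\b
(pointer-u8-set! (pointer+ payload 2) 99)  ; #\c

(define copy (pointer->blob payload 3))
(free payload)                             ; the blob holds its own copy

(display (blob->string copy))
(newline)
```

Inside the message callback, the same helper would presumably be called with the values of the generated accessors, e.g. mosquitto_message-payload (named in the reply above) and a payloadlen accessor assumed to follow the same naming pattern. Copying before the callback returns matters because the library frees the message afterwards.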
Re: [Chicken-users] more 'foreign' questions
> i haven’t used the bind egg but the documentation "General Operation”
> section beginning with "Structure and union definitions …” seems relevant.

Haha, yes, thanks. Finding the relevant parts of the docs seems to be the challenge in the beginning. Will read that.

> it rolls access routines, ex: mosquitto_message-mid,
> mosquitto_message-payload, ...
>
> #;1> (import bind)
> #;2> ,x* (bind* "struct mosquitto_message{
>   int mid;
>   char *topic;
>   void *payload;
>   int payloadlen;
>   int qos;
>   ___bool retain;
> };")

Oh, a useful tool to learn, it seems.

Thanks for the help. ... one follow-up question:

I now get a void pointer to some data in memory, and the length of the payload. How can I make that e.g. a blob? Or a string?
[Chicken-users] more 'foreign' questions
Now that I have managed quite a lot of my interfacing to the MQTT library, I'm stuck on the following: from a callback, I get back a pointer to a message struct, which is defined like this:

    (bind* "struct mosquitto_message{
      int mid;
      char *topic;
      void *payload;
      int payloadlen;
      int qos;
      ___bool retain;
    };")

When I print what I get in Scheme, it says something along the lines of #. Though the library frees the memory, it will (according to the manual) only do so after the callback returns. So *within* the callback, I should be able to access / print its content.

Just *how*?! How do I access the different fields in that struct? Especially the payload?

/Christoph
[Chicken-users] NULL in calls to C functions with bind
I wrote the following in my attempt to interface to the mosquitto MQTT library:

    (bind* "struct mosquitto *mosquitto_new(const char *id, ___bool clean_session, void *obj);")
    (define NULL (object->pointer 0))
    (define mqttc (mosquitto-new NULL #t NULL))

But I'm unsure about my adventurous definition of `NULL`. It works, but is it correct?

Another thing: on the bind egg's documentation page, `___blob` is not mentioned, but I luckily found it in the sql-de-lite code, and it saved me a lot of headaches :-) Shouldn't it be there?

/Christoph
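For what it's worth, another message in this archive calls (mosquitto-new #f #t #f), which relies on CHICKEN's foreign interface passing #f as a NULL pointer for pointer-like argument types. The sketch below is only a way to check that behaviour in isolation, assuming CHICKEN 5; `c-pointer-null?` and `c-string-null?` are hypothetical probe names, and the file has to be compiled with csc because foreign-lambda* is not available in the interpreter.

```scheme
;; Compile with csc; foreign-lambda* does not work in csi.
(import (chicken foreign))

;; Hypothetical probes: each reports whether the C side received NULL.
(define c-pointer-null?
  (foreign-lambda* bool ((c-pointer p)) "C_return(p == NULL);"))
(define c-string-null?
  (foreign-lambda* bool ((c-string s)) "C_return(s == NULL);"))

(display (c-pointer-null? #f)) (newline)   ; #f passed as a c-pointer
(display (c-string-null? #f)) (newline)    ; #f passed as a c-string
(display (c-string-null? "id")) (newline)  ; a real string is not NULL
```

If both #f probes report true, a hand-made `NULL` value is probably unnecessary for arguments such as `id` and `obj`.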
[Chicken-users] csc switch for linking foreign libraries?
Dear Chickeneers!

I might be stupid or blind, but I can't find a way of telling 'csc' about foreign libraries that have to be linked into my binary.

I have the following code:

    (import bind)
    (bind "int mosquitto_lib_init(void);")
    (display (mosquitto_lib_init))

When running 'csc mqtt.scm' on it, I get:

    [clange@sencha chicken-ffi]$ csc mqtt.scm
    mqtt.c: In function ‘stub30’:
    mqtt.c:27:8: warning: implicit declaration of function ‘mosquitto_lib_init’ [-Wimplicit-function-declaration]
     return(mosquitto_lib_init());
     ^~
    mqtt.c:24:54: note: in definition of macro ‘return’
     #define return(x) C_cblock C_r = (C_int_to_num(_a,(x))); goto C_ret; C_cblockend
     ^
    /usr/bin/ld: mqtt.o: in function `f_154':
    mqtt.c:(.text+0x611): undefined reference to `mosquitto_lib_init'
    collect2: error: ld returned 1 exit status
    Error: shell command terminated with non-zero exit status 256: 'gcc' 'mqtt.o' -o 'mqtt' -L/usr/lib -Wl,-R/usr/lib -lchicken -lm -ldl

Understandably, since I didn't link libmosquitto.so to it. But *HOW*?! I tried '-lmosquitto', but that's not passed on to gcc. Which is the correct way?
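One possible way to build this is sketched below. It assumes CHICKEN 5, where csc's -L option hands the following argument on to the linker, and it assumes libmosquitto is installed on the default library search path. Switching from bind to bind* is a further assumption aimed at the implicit-declaration warning, since bind* also emits the given C text into the generated code.

```scheme
;; mqtt.scm
;; Build (assumption: CHICKEN 5 csc, libmosquitto on the default search path):
;;   csc -L -lmosquitto mqtt.scm
(import bind)

;; bind* declares the prototype on the C side as well, unlike plain bind.
(bind* "int mosquitto_lib_init(void);")

;; Print the return code of the library initialisation.
(display (mosquitto_lib_init))
(newline)
```

A bare '-lmosquitto' is treated as an option for csc itself, which would explain why it never shows up in the gcc command line quoted in the error output.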
[Chicken-users] Error installing egg, but only on Raspbian
I adapted the Telebot (Telegram client) egg to be installable as a Chicken 5 egg:

https://github.com/recumbentbirder/Telebot/tree/make-chicken-5-egg

Installing it via 'chicken-install -s' works fine on my Arch Linux machine (Chicken installed via the package manager).

On my Raspberry Pi 3+ I installed Chicken 5.0 from the installation archive on the Chicken home page. When I try to install Telebot via 'chicken-install -s' there, I get this very short (concise?! ;-) error message:

    Error: (assq) bad argument type: #!eof

... with no information whatsoever about *where* something goes wrong. I read that an exhausted file pointer now yields #!eof instead of () in Chicken 5.

I'm a bit lost here as to *where to start* looking to solve this. Any suggestions / comments?
[Chicken-users] chicken-bind: don't get output for 'mosquitto.h'
Dear Chicken Schemers,

I have problems using 'chicken-bind'. I'm trying to make 'mosquitto', the MQTT library, available to Chicken. But

    chicken-bind -follow-include mosquitto.h

gives me a practically empty 'mosquitto.scm' file:

    8<
    ;;; GENERATED BY CHICKEN-BIND FROM mosquitto.h

    (begin)

    ;;; END OF FILE
    8<

(This is the 'mosquitto.h' file: https://github.com/eclipse/mosquitto/blob/master/lib/mosquitto.h)

Any idea where I should look for the solution to this?

/Christoph

--
Christoph Lange