Re: Regarding the implementations of PicoLisp
Hi Will, thanks for you long explanation! On Fri, May 9, 2014 at 7:59 PM, Rick Lyman lyman.r...@gmail.com wrote: flow.c:60:37: error: fields must have a constant size: 'variable length array in structure' extension will never be supported struct {any sym; any val;} bnd[length(x)]; On Sun, May 11, 2014 at 12:19:46PM -0700, William Cushing wrote: The normal thing to do, when encountering this problem, is to replace the variable length arrays with pointers. As in char foo[] - char * foo. Correct. For a LISP VM, for specifically the case of the root type of the VM, it might make more sense (than making any a typedef for void *) to make any a typedef for a tagged union along the lines of: typedef union { char c; short s; ... } all_builtin_types_type; typedef struct { tag_type tag; all_builtin_types_type value; } any; No. any is in fact a fixed-sized structure: typedef struct cell {// PicoLisp primary data type struct cell *car; struct cell *cdr; } cell, *any; This is not the problem. The problem is that all data sizes (lengths of lists, number of arguments to functions, sizes of various structures) are dynamic in PicoLisp, and determined at runtime. Considering the above case struct {any sym; any val;} bnd[length(x)]; which, btw, could also be typedef'd as typedef struct myStruct { any sym; any val; } myStruct; and then written as myStruct bnd[length(x)]; So for a compiler not supporting variable length arrays and structures, the above statement must be rewritten as myStruct *bnd; ... bnd = malloc(length(x) * sizeof(myStruct)); The problem with this is that is horribly inefficient. The dynamic version myStruct bnd[length(x)]; simply decrements the stack pointer by length(x) * sizeof(myStruct) (which is a single machine instruction!), while the malloc() call involves the whole memory management machinery. And then, don't forget the free() call! If the compiler supports alloca(), then it could be used instead of malloc(), with similar efficiency. But I doubt it will be available, because it requires the same runtime behavior. ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 08:06:57AM +0200, Alexander Burger wrote: The problem with this is that is horribly inefficient. The dynamic version myStruct bnd[length(x)]; simply decrements the stack pointer by length(x) * sizeof(myStruct) (which is a single machine instruction!), while the malloc() call involves the whole memory management machinery. BTW, this inability of C to properly support stack manipulations was the main reason to write the 64-bit version of PicoLisp in assembly. As I wrote this in http://software-lab.de/doc64/README The reasons to choose assembly language (instead of C) were, in decreasing order of importance: 1. Stack manipulations Alignment to cell boundaries: To be able to directly express the desired stack data structures (see doc64/structures, e.g. Apply frame), a better control over the stack (as compared to C) was required. Indefinite pushs and pops: A Lisp interpreter operates on list structures of unknown length all the time. The C version always required two passes, the first to determine the length of the list to allocate the necessary stack structures, and then the second to do the actual work. An assembly version can simply push as many items as are encountered, and clean up the stack with pop's and stack pointer arithmetics. ... Pushing and popping data of unknown length is at the heart of the PicoLisp interpreter. It is done all the time. Note that even with arrays of variable length, as in the discussed case: myStruct bnd[length(x)]; it is still not optimal, because the interpreter has to call length() on the list first, before actually processing it. The list needs to be traversed twice. In a language with proper stack control, you can simply call 'push' in the loop doing the processing. ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
And this is an excellent example of PicoLisp going the extra mile. Instead of handling C as the lowest abstraction, going to the actual machine. I imagine other interpreted languages could be faster if designed with this attention to detail. On 12 maj 2014 08:24:13 CEST, Alexander Burger a...@software-lab.de wrote: On Mon, May 12, 2014 at 08:06:57AM +0200, Alexander Burger wrote: The problem with this is that is horribly inefficient. The dynamic version myStruct bnd[length(x)]; simply decrements the stack pointer by length(x) * sizeof(myStruct) (which is a single machine instruction!), while the malloc() call involves the whole memory management machinery. BTW, this inability of C to properly support stack manipulations was the main reason to write the 64-bit version of PicoLisp in assembly. As I wrote this in http://software-lab.de/doc64/README The reasons to choose assembly language (instead of C) were, in decreasing order of importance: 1. Stack manipulations Alignment to cell boundaries: To be able to directly express the desired stack data structures (see doc64/structures, e.g. Apply frame), a better control over the stack (as compared to C) was required. Indefinite pushs and pops: A Lisp interpreter operates on list structures of unknown length all the time. The C version always required two passes, the first to determine the length of the list to allocate the necessary stack structures, and then the second to do the actual work. An assembly version can simply push as many items as are encountered, and clean up the stack with pop's and stack pointer arithmetics. ... Pushing and popping data of unknown length is at the heart of the PicoLisp interpreter. It is done all the time. Note that even with arrays of variable length, as in the discussed case: myStruct bnd[length(x)]; it is still not optimal, because the interpreter has to call length() on the list first, before actually processing it. The list needs to be traversed twice. In a language with proper stack control, you can simply call 'push' in the loop doing the processing. ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe -- Skickat från min Android-telefon med K-9 E-post. Ursäkta min fåordighet.
Re: Regarding the implementations of PicoLisp
And this is an excellent example of PicoLisp going the extra mile. Instead of handling C as the lowest abstraction, going to the actual machine. I imagine other interpreted languages could be faster if designed with this attention to detail. Exactly! Thank you Alex, for the insightful explanation. On 12 maj 2014 08:24:13 CEST, Alexander Burger a...@software-lab.de wrote: On Mon, May 12, 2014 at 08:06:57AM +0200, Alexander Burger wrote: The problem with this is that is horribly inefficient. The dynamic version myStruct bnd[length(x)]; simply decrements the stack pointer by length(x) * sizeof(myStruct) (which is a single machine instruction!), while the malloc() call involves the whole memory management machinery. BTW, this inability of C to properly support stack manipulations was the main reason to write the 64-bit version of PicoLisp in assembly. As I wrote this in http://software-lab.de/doc64/README The reasons to choose assembly language (instead of C) were, in decreasing order of importance: 1. Stack manipulations Alignment to cell boundaries: To be able to directly express the desired stack data structures (see doc64/structures, e.g. Apply frame), a better control over the stack (as compared to C) was required. Indefinite pushs and pops: A Lisp interpreter operates on list structures of unknown length all the time. The C version always required two passes, the first to determine the length of the list to allocate the necessary stack structures, and then the second to do the actual work. An assembly version can simply push as many items as are encountered, and clean up the stack with pop's and stack pointer arithmetics. ... Pushing and popping data of unknown length is at the heart of the PicoLisp interpreter. It is done all the time. Note that even with arrays of variable length, as in the discussed case: myStruct bnd[length(x)]; it is still not optimal, because the interpreter has to call length() on the list first, before actually processing it. The list needs to be traversed twice. In a language with proper stack control, you can simply call 'push' in the loop doing the processing. âªâ« Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe -- Skickat frÃ¥n min Android-telefon med K-9 E-post. Ursäkta min fÃ¥ordighet. -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 8:06 AM, Alexander Burger a...@software-lab.de wrote: Hi Will, thanks for you long explanation! Right, and thanks to all for this interesting journey in the internals. The problem with this is that is horribly inefficient. I'm interested by a clang compatible version, just to see what emscripten will make of it. For the sake of the experience I'm gonna try anyway. The problem is it could take me some time. If someone fluent in C could try, that would be interesting!!! (sorry, I haven't much more to offer). By the way, is the source of miniPicoLisp in the repo at code.google? https://code.google.com/p/picolisp/source/browse/ chri -- http://profgra.org/lycee/ (site pro) http://delicious.com/profgraorg (liens, favoris) https://twitter.com/profgraorg On Mon, May 12, 2014 at 10:59 AM, andr...@itship.ch wrote: And this is an excellent example of PicoLisp going the extra mile. Instead of handling C as the lowest abstraction, going to the actual machine. I imagine other interpreted languages could be faster if designed with this attention to detail. Exactly! Thank you Alex, for the insightful explanation. On 12 maj 2014 08:24:13 CEST, Alexander Burger a...@software-lab.de wrote: On Mon, May 12, 2014 at 08:06:57AM +0200, Alexander Burger wrote: The problem with this is that is horribly inefficient. The dynamic version myStruct bnd[length(x)]; simply decrements the stack pointer by length(x) * sizeof(myStruct) (which is a single machine instruction!), while the malloc() call involves the whole memory management machinery. BTW, this inability of C to properly support stack manipulations was the main reason to write the 64-bit version of PicoLisp in assembly. As I wrote this in http://software-lab.de/doc64/README The reasons to choose assembly language (instead of C) were, in decreasing order of importance: 1. Stack manipulations Alignment to cell boundaries: To be able to directly express the desired stack data structures (see doc64/structures, e.g. Apply frame), a better control over the stack (as compared to C) was required. Indefinite pushs and pops: A Lisp interpreter operates on list structures of unknown length all the time. The C version always required two passes, the first to determine the length of the list to allocate the necessary stack structures, and then the second to do the actual work. An assembly version can simply push as many items as are encountered, and clean up the stack with pop's and stack pointer arithmetics. ... Pushing and popping data of unknown length is at the heart of the PicoLisp interpreter. It is done all the time. Note that even with arrays of variable length, as in the discussed case: myStruct bnd[length(x)]; it is still not optimal, because the interpreter has to call length() on the list first, before actually processing it. The list needs to be traversed twice. In a language with proper stack control, you can simply call 'push' in the loop doing the processing. ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe -- Skickat från min Android-telefon med K-9 E-post. Ursäkta min fåordighet. -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe -- http://profgra.org/lycee/ (site pro) http://delicious.com/profgraorg (liens, favoris) https://twitter.com/profgraorg -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
Hi Christophe, I'm interested by a clang compatible version, just to see what emscripten will make of it. For the sake of the experience I'm gonna try anyway. Nice! Probably much more interesting (and useful) would be to port the pil64 assembler to clang. I considered that initially, but then gave up for the same reasons (no control over stack and CPU flags). By the way, is the source of miniPicoLisp in the repo at code.google? https://code.google.com/p/picolisp/source/browse/ No, miniPicoLisp is separate. In fact, I had already stopped maintaining it (after pil64 was out), but then went on to keep it in sync with all relevant changes and fixes. So it is only available as a tarball at http://software-lab.de/miniPicoLisp.tgz ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 5:40 AM, Christophe Gragnic christophegrag...@gmail.com wrote: I'm interested by a clang compatible version, just to see what emscripten will make of it. For the sake of the experience I'm gonna try anyway. chri, I'm also interested in a emscripten compiled minPicoLisp. I've been itching to try out emscripten and I think having miniPicoLisp available on all platforms without building could help PicoLisp adoption. I was able to compile miniPicoLisp on windows under clang. I basically just replaced all instances of variable array initialization, such as: struct {any sym; any val;} bnd[length(x = car(expr))+3]; with //TODO struct {any sym; any val;} bnd[100]; It builds and runs. I don't see any obvious consequences yet. I would have assumed something like this would fail: (setq Z (make (for N 120 (link N I don't have emscripten geared up on this machine to try to compiling with it. I figure it's not that far off though as a proof of concept. Alex, is there a reasonably safe upper bounds that can be used instead of it being determined dynamically? (http://clang.llvm.org/compatibility.html#vla) 1. replace the variable length array with a fixed-size array if you can determine a reasonable upper bound at compile time; sometimes this is as simple as changing int size = ...; to const int size = ...; (if the initializer is a compile-time constant); I can envision a neat tool that lets us share snippets and even run code directly from rosettacode. GNU APL.js supports pasting code to it's emscripten compiled version of apl: http://baruchel.hd.free.fr/apps/apl/#code=5%205%20%E2%8D%B4%20%E2%8D%B310%0A
Re: Regarding the implementations of PicoLisp
Hi Joe, struct {any sym; any val;} bnd[100]; ... It builds and runs. I don't see any obvious consequences yet. I would have assumed something like this would fail: (setq Z (make (for N 120 (link N This doesn't actually use any variable length array. You may instead try (be sure to set the stack size to unlimited before): $ ulimit -s unlimited $ ./pil + : (apply + (need 100 1)) - 100 If you don't exceed the fixed limit, but use a rather large fixed size like the 100 above, you'll consume a large amount of stack space for each function call. The 100 structures will occupy (with 4-byte-pointers for 'sym' and 'val' on a 32-bit system) 800 bytes on each recursion. Alex, is there a reasonably safe upper bounds that can be used instead of it being determined dynamically? Hmm, what is safe? In any case you use the generality of the language, the Unlimited design objective of PicoLisp. You can never be sure that you don't exceed one of these fixed sizes. This applies not only to the number of function arguments, where you can be rather sure that you don't write a function with 100 arguments, but where it easily happens in the 'apply' family of functions (mapping), and also everywhere an environment (a closure) in a list is handled (e.g. in 'job' or 'bind'). Another significant situation is the frequent code snippet char nm[bufSize(y)]; bufString(y,nm); where strings (symbol names) are handled. You cannot be sure that the string is not several megabytes in length (e.g. if you read in a whole file). But reserving several megabytes on the stack is not an option. ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
Hi Alex, Thanks for the reply and the details. On Mon, May 12, 2014 at 9:56 AM, Alexander Burger a...@software-lab.dewrote: Alex, is there a reasonably safe upper bounds that can be used instead of it being determined dynamically? Hmm, what is safe? In any case you use the generality of the language, the Unlimited design objective of PicoLisp. You can never be sure that you don't exceed one of these fixed sizes. The Unlimited design of PicoLisp is incredibly powerful. To clarify, I was seeking some guidance on a reasonable upper limit on variable length arrays that doesn't significantly handicap the language for demoing in an emscripten environment. The proper solution is likely to use malloc/free but that would introduce additional effort/complexity that might be unnecessary for a proof of concept. I sometimes prefer hacking a small change just to see if it's possible before letting myself go down a rabbit hole. It sounds like the reasonable upper limit might depend on the function. I think I may have changed approximately 8 places that use the sym/val structure. I'll take a closer look. I was hopeful that there might have been a quick answer that will work the vast majority of use cases with little or no impact on run time and memory.
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 2:16 PM, Joe Bogner joebog...@gmail.com wrote: I was able to compile miniPicoLisp on windows under clang. I basically just replaced all instances of variable array initialization, such as: struct {any sym; any val;} bnd[length(x = car(expr))+3]; with //TODO struct {any sym; any val;} bnd[100]; It builds and runs. I didn't have this luck. I just set up a repository on github (Alex being OK) and reported my issue here: https://github.com/Grahack/minipicolisp/issues/1 Could you show me your source files? Could you try with my source files? Or tell me the differences? Would it be easy for you to be granted commit access? On Mon, May 12, 2014 at 4:31 PM, Joe Bogner joebog...@gmail.com wrote: […] The proper solution is likely to use malloc/free but that would introduce additional effort/complexity that might be unnecessary for a proof of concept. I sometimes prefer hacking a small change just to see if it's possible before letting myself go down a rabbit hole. The same for me. My goal being beyond the proof of concept, but not very far though. chri -- http://profgra.org/lycee/ (site pro) http://delicious.com/profgraorg (liens, favoris) https://twitter.com/profgraorg -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
I added my changes to this repo: https://github.com/joebo/miniPicoLisp This commit has everything needed to build on clang on windows: https://github.com/joebo/miniPicoLisp/commit/e34b052bc9c8bd8fa813833294a5830a69ffb56e I'm using: C:\Users\jbogner\Downloads\miniPicoLisp\srcclang -v clang version 3.4 (198054) Target: i686-pc-mingw32 Thread model: posix Selected GCC installation: And my make came from http://sourceforge.net/projects/win-bash/ I am happy to answer any questions/help further On Mon, May 12, 2014 at 11:50 AM, Christophe Gragnic christophegrag...@gmail.com wrote: On Mon, May 12, 2014 at 2:16 PM, Joe Bogner joebog...@gmail.com wrote: I was able to compile miniPicoLisp on windows under clang. I basically just replaced all instances of variable array initialization, such as: struct {any sym; any val;} bnd[length(x = car(expr))+3]; with //TODO struct {any sym; any val;} bnd[100]; It builds and runs. I didn't have this luck. I just set up a repository on github (Alex being OK) and reported my issue here: https://github.com/Grahack/minipicolisp/issues/1 Could you show me your source files? Could you try with my source files? Or tell me the differences? Would it be easy for you to be granted commit access? On Mon, May 12, 2014 at 4:31 PM, Joe Bogner joebog...@gmail.com wrote: […] The proper solution is likely to use malloc/free but that would introduce additional effort/complexity that might be unnecessary for a proof of concept. I sometimes prefer hacking a small change just to see if it's possible before letting myself go down a rabbit hole. The same for me. My goal being beyond the proof of concept, but not very far though. chri -- http://profgra.org/lycee/ (site pro) http://delicious.com/profgraorg (liens, favoris) https://twitter.com/profgraorg -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subjectUnsubscribe
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 11:50 AM, Christophe Gragnic christophegrag...@gmail.com wrote: I just set up a repository on github (Alex being OK) and reported my issue here: https://github.com/Grahack/minipicolisp/issues/1 I think the main difference is your Makefile https://github.com/Grahack/minipicolisp/commit/15a72e16b097336c3a1d3b4092f3656509183308 Instead of: clang $*.c I'm doing this: $(CC) -w -c $*.c The -w suppresses warnings and then linking with: c:/Progra~2/LLVM/bin/clang.exe -o picolisp $(picoFiles:.c=.o)
Re: Regarding the implementations of PicoLisp
Why not alloca()? El 12 May 2014, a las 16:31, Joe Bogner joebog...@gmail.com escribió: The proper solution is likely to use malloc/fre -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Regarding the implementations of PicoLisp
Ah, I read too quickly, didn't notice/realize the length(...) subexpression was the variable part. If you don't have alloca, and you don't want to use assembler, and you don't want the overhead of malloc/free, and you don't want to, or literally can't, risk demons flying out of your nose: typedef char byte; byte hack[HACK_SIZE]; // hack is meant to remind one of stack byte * hack_ptr = hack+(HACK_SIZE-1); void * hack_alloc(size_t num_bytes) { return hack_ptr -= num_bytes; } void * hack_dealloc(size_t num_bytes) { return hack_ptr += num_bytes; } // if you don't trust your compiler to inline one-liners (what?! get a new compiler, and/or file a bug report, but if all else fails): //#define hack_alloc(n) ((void *)hack_ptr -= n) //#define hack_dealloc(n) ((void *)hack_ptr += n) // you can call these push and pop or something similar instead. // (if you get annoyed by the type casting, you can change to a byte * interface instead of a void * interface, ofc.) // Importantly though, dealloc isn't free; you have to track lengths separately all the way from alloc to dealloc, // which automation is a (very) small part of the overhead of malloc/free. -Will P.S. If you're feeling dirty, and you know exactly how your compiler allocates call frames, sometimes you can capture the address of your last local var, as in: int sp; #ifdef STACK_DOWNWARDS #define salloc(n) (sp -= n) #define sdealloc(n) (sp += n) #else //... #endif void my_proc(my_vec x) { int i,j,k; sp = k; salloc(length(x)); //... sdealloc(length(x)); } But this will surely fail (thankfully at compile-time) for emscripten/clang. Also, it will likely fail (at runtime, resulting in 'random' code execution) if the compiler introduces temporaries, or you call subroutines. A similar trick is: void * get_sp_here(int bogus) { return bogus; } but, again, this relies on implementation-defined behavior, which will again be rejected by emscripten/clang. On Mon, May 12, 2014 at 10:03 AM, Joe Bogner joebog...@gmail.com wrote: On Mon, May 12, 2014 at 11:50 AM, Christophe Gragnic christophegrag...@gmail.com wrote: I just set up a repository on github (Alex being OK) and reported my issue here: https://github.com/Grahack/minipicolisp/issues/1 I think the main difference is your Makefile https://github.com/Grahack/minipicolisp/commit/15a72e16b097336c3a1d3b4092f3656509183308 Instead of: clang $*.c I'm doing this: $(CC) -w -c $*.c The -w suppresses warnings and then linking with: c:/Progra~2/LLVM/bin/clang.exe -o picolisp $(picoFiles:.c=.o)
Re: Regarding the implementations of PicoLisp
On Mon, May 12, 2014 at 7:03 PM, Joe Bogner joebog...@gmail.com wrote: I think the main difference is your Makefile Instead of: clang $*.c I'm doing this: $(CC) -w -c $*.c The -w suppresses warnings Great. It works now. I fixed the warnings and didn't add the -w flag though
Re: Regarding the implementations of PicoLisp
Hi Will, If you don't have alloca, and you don't want to use assembler, and you don't want the overhead of malloc/free, and you don't want to, or literally can't, risk demons flying out of your nose: typedef char byte; byte hack[HACK_SIZE]; // hack is meant to remind one of stack byte * hack_ptr = hack+(HACK_SIZE-1); void * hack_alloc(size_t num_bytes) { ... Well, nice try, but it is not helpful here. Basically you are implementing you own malloc(), which is still far away from a single-instruction push, pop or stack arithmetic. And it is worse than that, because it violates the Unlimited principle even more, with the constant HACK_SIZE. If you make 'hack' variably sized (allocated by malloc() or realloc() again?), you get a full blown self-made memory manager. So why the trouble? :) ♪♫ Alex -- UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe