Re: PEG Patches
Hello again,

I was about to do this, and then I discovered that it wouldn't work, because there are a few special-case PEGs that don't make sense as macros. Specifically, in the context of a PEG, we interpret strings as matching themselves, and those can't be made into macros. So I went ahead and implemented a simple way to extend peg-sexp-compile. It turned out to be much less difficult than I was afraid of.

The first attached patch adds the interface to (ice-9 peg codegen) and changes most of the functions there to use it, and also adds some documentation in the PEG Internals section. The second one updates (ice-9 peg string-peg) to use it as well, and gets rid of peg-extended-compile from peg.scm, since it's no longer needed. I wrote the patches on top of the last two that I sent, because those included some cleanups that I wanted to keep.

Noah

On Tue, Mar 29, 2011 at 9:20 AM, Andy Wingo <wi...@pobox.com> wrote:
> On Tue 29 Mar 2011 14:47, Noah Lavine <noah.b.lav...@gmail.com> writes:
>
>>> (define-peg-matcher and cg-and)
>>
>> That's doable. But if we're going to choose what to do entirely based
>> on the first element of the list, then we could also just not define
>> peg-sexp-compile at all and make each of the code generation functions
>> into macros. How does that sound?
>
> Good idea. Sounds great to me!
>
> Andy
> --
> http://wingolog.org/

0001-Extensible-PEG-Syntax.patch
Description: Binary data

0002-Update-String-PEGs.patch
Description: Binary data
Re: GSoC 2011
Hello,

Your ideas sound neat, but there are a few things I am not familiar with.

On Thu, Mar 31, 2011 at 10:31 AM, Diogo F. S. Ramos <diogo...@gmail.com> wrote:
>> What do you want to do in this area? There is important work to do with
>> introspection, but we would need to see your ideas and your code.
>
> You can see some of the code here:
> http://gitorious.org/~diogofsr/guile-gir/didi-guile-gir
>
> The thing is: I don't know what to do. Maybe some history will clear
> things up. I started out trying to port librepository to Guile from the
> ground up, using just C. While doing so, people at #introspection
> pointed me to the great port by zeenix (guile-gir), as you did.

What does librepository do?

> I contacted zeenix and he was very kind enough to help me with it, even
> commenting on my commits. It was very cool of him. After some time, and
> after talking to you, it was suggested that an Introspection
> implementation should use (gnome gobject), and the pages from
> Introspection itself say that it is a good idea to build on a previous
> GObject binding, as Python does. So I started playing with it.

When you talk about introspection, are you talking about introspecting on GObjects, or all Guile objects? (Guile has its own object system. If I understand correctly, all GObjects can be Guile objects, but not all Guile objects are GObjects.)

> Some more time passed, and I thought it would be a good idea to go full
> power with the dynamic FFI, so there I went. After some more time and
> IRC talk, rotty introduced me to his great sbank, and after some more
> talk at #guile it was pointed out that sbank is the direction Guile
> should go for an Introspection binding.
>
> So, as you can see, I don't have a clear vision of what has been done
> and what needs to be done, or how. I would happily give it a shot, but I
> need some kind of guidance.

According to some Gnome webpage Google found, GLib introspection is intended to make it easy to wrap GObject objects in higher-level languages. If I understand correctly, you wish to do this for Guile?
That sounds like a good thing to do, if so. Noah
Re: GSOC Advice
Hello,

Thanks for the advice! I am in America (in the Eastern Time Zone, to be precise). I'm a relatively new committer, which is why I haven't been advertising for people to mentor, but I would be happy to help with anything I can.

Noah

On Mon, Apr 4, 2011 at 5:50 PM, Phil <theseaisinh...@gmail.com> wrote:
> If anyone was thinking about participating in GSOC this year, I have a
> couple of tips:
>
> - Do it; it's a great program and the Guile team is great to work with.
> - If you can, develop against the stable release of Guile.
> - (for the Guile team) I'm not sure if we have any students in America
>   this year, but if we do, it'd be great if a mentor would step up. The
>   turnaround time on communications due to timezone differences was a
>   little problematic for me last year.
>
> Good luck!
> Phil.
Re: Problem with GCC as a Scheme compiler: tail calls
Ah, I see. The problem is quite significant. I found this page in the GCC internals documentation: http://gcc.gnu.org/onlinedocs/gccint/Tail-Calls.html. It seems to suggest that we would need to make our own calling convention.

So I guess the question is: is it worthwhile for us to fix GCC so it can handle tail calls, or should we use another code generation library that already deals with them? I must admit I still really like the idea of using GCC, because it has a lot of good optimizations, and I also like the idea of working with other groups rather than making yet another compiler. However, I guess the next thing to do is to ask on the GCC mailing list whether they're interested in tail call patches, and how hard those would be to make.

I realize, however, that the GSoC deadline is tomorrow, and that Paul is probably wondering whether he should submit a proposal for an AOT compiler or not. So I suggest that an AOT compiler is a good project, and that the first step should be to figure out whether it's possible for us to use GCC. What do people think of this? (Also note: I know Paul outside of this mailing list, so my opinion is not objective.)

Noah

On Thu, Apr 7, 2011 at 10:25 AM, Mark H Weaver <m...@netris.org> wrote:
> Noah Lavine <noah.b.lav...@gmail.com> writes:
>
> There is one _very_ serious problem with using GCC to compile Scheme, or
> at least there was the last time I researched this issue: tail calls.

I might be wrong about this, but I thought that GCC supported multiple calling conventions, with the user telling GCC which one to use (cdecl, fastcall, possibly others). If so, it must have some way to represent that in its intermediate representations. We might be able to add our own calling convention.

I also know that GCC has a -foptimize-sibling-calls option, which will make it do something similar for at least some regular C calls. I'm not sure if it's possible, but maybe we could arrange for whatever code transformation implements this to be run on every Scheme call.
Please read section 3.3 of the thesis I referenced: http://users.cecs.anu.edu.au/~baueran/thesis/

There you will find the list of conditions required for tail calls to be optimized by GCC circa 2003. Among other things, it includes:

* The caller's stack frame must be empty (i.e. no stack-allocated local variables or temporaries).
* No overlapping arguments (i.e. if it's too difficult to rearrange the input arguments on the stack to match what the next function expects, GCC will bail on the tail call).
* No position-independent code on several platforms, including x86 (i.e. no tail calls in shared libraries).
* No indirect calls (i.e. no tail calls to function pointers).
* No setjmp or longjmp.
* Sibling call epilogue must be defined (i.e. many targets aren't supported).
* Additional machine-dependent constraints (i.e. many targets impose additional restrictions).

The end result of all of these restrictions is that tail calls can only be optimized by GCC in extremely simple cases like fibonacci. Some of these restrictions may have been lifted since 2003, but I'm reasonably sure the problem has not been solved satisfactorily. If it had been, it would have been very big news in the functional programming language implementors' community, since this has been a long-standing problem.

However, what I find more persuasive than any paper is the fact that I've never found an implementation of a high-level programming language based on GCC that manages to support tail calls without doing some nasty hacks. Lots of smart people have worked on this, but to my knowledge, no one has found a good solution. The solutions I'm aware of are:

* Cheney on the MTA, as is done by Chicken.
* Trampoline: instead of calling functions directly, return the address of the next function back to the trampoline loop.
* Don't do function calls at all. Instead use inline asm to jump to the inside of the next function, passing arguments via register globals. GHC did this at one point.
GCC's problems in this area are almost certainly the main reason so many functional programming language implementors have resorted to writing their own native code generators from scratch. If anyone knows of a recent breakthrough in this area, please speak up.

Best,
Mark
Re: [PATCH] Fix the R6RS exact-integer-sqrt and import into core guile
Hello,

> 1. I think we shouldn’t augment the C API unless strictly necessary,
> because the idea is to write Scheme, not C, and because of the
> maintenance cost.

Why would we not want to augment the C API? It seems like a natural complement to augmenting the Scheme API, since Guile is a C library *and* a Scheme implementation.

Noah
Re: Problem with GCC as a Scheme compiler: tail calls
Hello,

> Regarding GCC, I have spoken to GCC folk, and they are not averse to
> making GCC into a more modular thing. There are obvious licensing
> concerns, but these are surmountable: Guile and GCC could work together
> somehow.
>
> The problem, as I understood it last year, was that GCC doesn't know
> exactly what abstractions are necessary. For that we need to get our
> own house in order, write an AOT compiler for one architecture, then go
> to GCC and say "it would be nice if we had these features", or
> something like that. People's experiences with LLVM are probably
> helping here, to define the problem space, but we need something simple
> to push to GCC folk, like a simple less-than-Scheme implementation with
> only fixnum arithmetic or something, both AOT and with a REPL.

Neat! I had an idea of how this transition could go that is different, but I think it might turn out to involve a lot of the same steps as yours. I was imagining that we would start with a compiler that was almost all custom code, except that the backend is GCC. In the most extreme case, maybe GCC would act as a big assembler, but I think we should be able to use a few more of its features than that. Once we have that, I imagine that we could integrate our code into GCC over time, either merging our stuff with similar code already in GCC or adding new things. If I understand correctly, you were thinking that we would write our compiler with some other backend, figure out what changes we want GCC to make, then switch to using GCC once they're ready. Right?

I was also imagining that we could start with a batch-compiling AOT compiler like current GCC, and then make it more dynamic as GCC became more modular. It seems like you want to start out with something that could be called in the background by the runtime. Is that right?
I was thinking that if we could teach GCC to compile tail calls, then we could use quite a bit of its infrastructure without any further changes, because we could make a Scheme procedure compile to a GCC procedure, and in fact implement all Scheme control flow stuff as a bunch of GCC procedures. We would still have to implement Scheme datatypes ourselves, but that would still give us a decent amount of integration to begin with. What do you think? Noah
Re: soc mentor?
Hello,

I would be happy to mentor for SoC! I think it's a great program. I can also promise that I'll have a lot more time to work on Guile stuff over the summer than I do now, which should be perfect for SoC. :-)

My only questions are about logistics. Do I have to do any paperwork or something like that to mentor? Where can I see the proposals and give feedback? And how many spots does Guile have?

Also, if we have three proposals for CPAN, then certainly two people should do something else, but how do we decide which two? Paul and Diogo both posted a list of proposals to the list before settling on their final one, so I imagine we could ask them to choose a different one from their lists.

Noah

On Tue, Apr 12, 2011 at 4:44 PM, Andy Wingo <wi...@pobox.com> wrote:
> Hi Noah,
>
> Are you interested in mentoring for the SoC this year? We currently
> have four proposals; they all look fine, but three of them are for the
> CPAN-alike. We need to shake up those students. But if we were to get
> more slots -- totally unknown at this point, btw -- we would need more
> mentors, and more projects. For example Diogo might be interested in
> guile-gnome stuff, and perhaps Paul or Kentarô might be interested in
> something else as well. Perhaps you could mentor a CPAN. Dunno.
>
> Basically the deadline for project applications has passed, but
> students can still modify their proposals, so actually they can change
> to totally different proposals.
>
> Anyway. Let me know your thoughts. Copying guile-devel so this is
> happening in public; and actually if any guile-devel reader feels
> interested in mentoring, and you are active in Guile, let me know and
> we can talk about it. But we do have to be somewhat quick about it.
>
> Cheers,
> Andy
> --
> http://wingolog.org/
A Modest Proposal
Hello Guile and CLisp developers,

I'm writing to talk about vague big-picture ideas, but please bear with me for a minute, because I think this could be useful.

I noticed in the recent GNU Summer of Code applications (I'm a mentor for Guile) that CLisp wants to become embeddable, and to embed into Emacs as a first test case. That would make CLisp an embeddable Lisp implementation with a bytecode-interpreting virtual machine, compiler, interpreter, debugger, etc. A very cool thing. CLisp developers might not be aware that that is exactly what GNU Guile is - a bytecode-interpreting virtual machine with a compiler, interpreter, debugger, and collection of useful tools. Guile is already embeddable, because that was one of Guile's original goals, but the difference is not so big. Guile also has a Summer of Code project to embed itself into Emacs, in fact.

It seems to me that it is time for Guile and CLisp to consider working together, because if we don't, I think we're both going to do the same work twice, separately. This depends greatly on what CLisp's goals are as a project, and I do not know those. Maybe you have very different goals than Guile, in which case we might not gain anything by working together. But I do have a feeling that we are both evolving towards the same place, and if so, I think it would be nice to cooperate some along the way.

Noah
Re: Avoiding variable clashes
I think that mechanism is all that Guile uses at present. However, it should be general enough to resolve all situations where variables of the same name refer to different entities, assuming you set up the environments correctly.

Are you planning on implementing a theorem prover for Guile? That would be cool.

On Wed, Apr 13, 2011 at 11:46 AM, Hans Aberg <haber...@telia.com> wrote:
> On 13 Apr 2011, at 17:27, Andy Wingo wrote:
>
>>>> What method is Guile using to avoid substitution variable clashes
>>>> (de Bruijn numbers, combinators, etc.)?
>>>
>>> Each lexical variable is given a fresh name (a gensym) when it is
>>> introduced. The expander keeps an environment as to what name maps to
>>> what gensym, and residualizes the gensym in the lexical reference or
>>> assignment. See "The Scheme Compiler" in the manual, for more.
>>
>> I am thinking about it in the context of other types of binders, which
>> satisfy the alpha-rule but not the beta, useful in math (like theorem
>> provers). Has that been discussed?
>
> Sorry, I don't know what you mean. References?

There is an article here:
http://en.wikipedia.org/wiki/Variable_binding_operator

Hans
Re: A Modest Proposal
Hello,

> I think we should first compare the virtual machines. If no obvious
> impossibility is observed, then perhaps modifying the compiler of clisp
> to generate guile VM code would be an easy path to obtain a CL
> implementation running on the guile VM. (This would disable the
> interpreter in clisp, since it is implemented in C.) Or, vice versa, if
> the VM of clisp has advantages (but if guile has continuations, this
> would be lacking in the clisp VM).
>
> In general, there may be a need for a very good lisp virtual machine to
> run and integrate lisp code in general (CL, various schemes, and other
> sorts of lisp-like languages; we could include perhaps implementations
> of python, ruby, smalltalk, javascript, etc.). From well afar, it looks
> like the JVM is not good enough for lisp (ABCL and Clojure seem to have
> some difficulties implementing some basic lisp features on the JVM).

I think what you're saying may be good, but I'm very hesitant to move ahead without lots of people agreeing. I would be very afraid that we would have friction as two groups of people who hadn't known each other previously tried to work together. For instance, what if some of us want to move the VM in one direction and others in another direction? Or what if the groups have different standards for when we make changes to the VM?

On the other hand, there could be significant advantages. I see two big ones: first, that interoperability means that when one of us scratches an itch, all of us benefit, and second, that users of embedded interpreters would have to learn less. If the GNU project has only one embedded interpreter, then once you know how to use it, you can modify all GNU programs.

So I think it might be good in the end if we all worked together, but I think it is a rather big step. What do other people think?
Finally, to answer a few of your concrete points:

> This VGLVM would:
>
> 1. be embeddable in applications,
> 2. include a FFI to native code,
> 3. be more performant than the JVM,
> 4. be natively multithreaded,
> 5. have a good real-time multithreaded garbage collector,
> 6. possibly have a JIT and/or a retargeting compiler,
> 7. allow easy (FFI-less) communication between the different languages
>    targeting it (for example, when we run pseudo, we can call CL
>    functions from scheme and scheme functions from CL).

Guile already has 1, 2, and 4, and we are planning both parts of 6 for the near future. I think 7 would come automatically for languages that shared a VM. 3 and 5 I don't know anything about.

It's also worth noting that Guile is already set up to compile multiple languages to its VM, so if we decide to go that route, some of the infrastructure is already there.

Noah
Re: Avoiding variable clashes
>> The fact that we represent different variables as gensyms is an
>> implementation detail, and unimportant. (In fact, I might like to
>> change that detail at some point to get better error messages, but
>> it's not high on my to-do list.)
>
> What are you referring to here?

It's just part of my mental image of how variables work. I think of the key idea as there being different variable objects, or perhaps slots, even for variables referenced by the same symbol. We implement that by assigning a unique gensym to each variable object, I think, but you could assign a unique anything to each variable object and it would work just as well. Is that an accurate picture of how Guile works?

When I said I might like to change that, I was thinking of a thread a while ago about how to handle top-level variables defined in macros. It seems to me that instead of having a gensym object for each separate variable, you might want a pair of (symbol . macro), where symbol is the name of the variable and macro is the macro expansion it came from. Then you could print stack traces or error messages with information like "variable x (from the third expansion of (while))". I haven't really thought through the idea much, though. As I said, it's not high on my to-do list.
PEG Changes
Hello all, I just pushed some changes to the wip-peg branch. They eliminate the last remaining circularity in the module imports and then add everything to the Makefile. It passes all tests on my machine. This is my first time pushing something, so I hope I did it correctly. If the patches aren't good I assume we can revert them. Noah
Re: A Modest Proposal
These are all good points. Integrating different languages is hard. Quite frankly, I am not yet sure that sharing a VM would be a good idea for us, but as we talk more about it, it is seeming more and more reasonable.

I sent my original message because I saw that CLisp was becoming an embeddable Lisp interpreter with an architecture similar to Guile's, and it seemed to me that the two projects were so similar that they might be able to collaborate productively. I think the idea of moving to a common VM might work, because Scheme and Common Lisp are much closer than, say, Scheme and Python. I think almost all of the semantics would map nicely from one language to the other (although I haven't thought about it very hard yet). It would be even easier with Guile than with another Scheme, because Guile has already worked on integrating Elisp, which shares some features with CL (i.e. it uses nil as both #f and '()).

However, I haven't thought hard about integrating Guile and CL yet. On the surface, though, I think it could have advantages not necessarily for us, but for other GNU projects and users, which I outlined in a previous email. What do other people think of the idea?

Noah

On Wed, Apr 13, 2011 at 10:06 PM, William ML Leslie <william.leslie@gmail.com> wrote:
> On 14 April 2011 01:23, Pascal J. Bourguignon <p...@informatimago.com> wrote:
>> In general, there may be a need for a very good lisp virtual machine
>> to run and integrate lisp code in general (CL, various schemes, and
>> other sorts of lisp-like languages; we could include perhaps
>> implementations of python, ruby, smalltalk, javascript, etc.). From
>> well afar, it looks like the JVM is not good enough for lisp (ABCL and
>> Clojure seem to have some difficulties implementing some basic lisp
>> features on the JVM).
>
> The problem of language integration has become common enough that we
> already have the opportunity to make some observations about what did
> work, and what didn't.
A concerted approach to working with distinct object spaces/models would be a mammoth task; it is an entirely different undertaking from integrating two particular languages.

The particular issue I find is that the mechanisms for defining how two object spaces interact don't compose well. If you work in a Java environment, for example, you may know how to interact with or create Java objects from JRuby, Kawa, Jython, E, and Clojure, but that doesn't let you interact between any of these languages in a sane way. Quick: what is the process for reifying my Jython or Kawa function as a callable in JRuby?

This common-denominator approach applies equally well outside these safe bytecode VM environments - there are all sorts of mechanisms for generating FFI code for specific VMs given a C or C++ interface. For the most part, they make you feel like you are writing C or C++.

Another thing worth thinking about involves promoting idiomatic usage despite differences in languages. If you've ever implemented a language inside a VM that was not designed for it, you will know what I mean when I ask questions like:

0. Objects in javascript are maps from string keys to any value. If they are to be cast as hashes or dictionaries when passed to a language that allows non-string keys, and the language adds a non-string key, what happens to the javascript object?

1. How are other languages to make or handle multi-valued returns?

2. What happens to cons cells at the boundary between languages where one has immutable cons cells, the other mutable?

3. Further, what happens when a language with linked lists as the default sequence type and one with arrays as the default sequence type interact? Consider, for example, the various options that pymacs provides.

4. What about operators, and the various mechanisms and naming conventions, single/multiple dispatch, AND the guarantees provided by each operator mechanism in any given language?

5.
What do you want to do about subclassing and multiple inheritance? In a single-inheritance language like Java, the compiler is free to assume that in

    if (x instanceof Foo) { }
    if (x instanceof Bar) { }

if Bar and Foo are distinct concrete classes and one is not a subclass of the other, the second "if" can be replaced with an "else if". Method resolution order is not consistent between MI languages, either, and may be further complicated by how they are to interact with generic functions.

6. Languages may also make type-related requirements that need to be checked somehow, such as some languages not permitting the initialiser to be called more than once, final methods, &c.

7. In what way do generic functions and methods interact? How about classes and typeclasses?

8. What happens when a language that is strict about arity interacts with a language that is not? Does calling a javascript function from python with the wrong number of arguments raise a TypeError, or silently
Re: CPAN-type thing: specifications, wishes, thoughts?
Hello all,

I'm afraid this email is coming much later in the planning process than it should, and quite possibly we won't be able to do any of this for SoC, and that's fine with me. But I was thinking about what we could do that would be a killer feature for Guile's CPAN - something that isn't already done by apt-get, dorodango, and every other package manager. One answer is that having a big collection of Guile packages *is* a killer feature, but we could have an apt repository of those if we wanted one.

I think the answer is that the killer feature for a large repository of packages is the ability to painlessly bundle a piece of software you've been writing, together with all of its dependencies, into one easy-to-install thing for users. After all, it's easy when you're developing on your own machine to download a bunch of packages and use them to do whatever you need to do, but if you then want to send that to other people, you've got to collect the source code (or at least .go files) for all of them, put them in a folder, make sure the load-path will work out, and then, most importantly, do all of that again every time a new version of one of your dependencies comes out.

I think the feature that is missing is the ability to say "take my software and package it so that its only external dependency is POSIX", or something similar. The implementation of such a thing isn't especially deep, but I bet there will be a lot of little things that need to be done just right for it to work. I think this could be part of a package manager that also does the other things Paul was listing.

How does this idea sound?

Noah

On Tue, Apr 12, 2011 at 11:01 PM, Paul Raccuglia <pra...@gmail.com> wrote:
> Hi. Since the idea of creating something like CPAN is something folks
> want, I figure it would be good to collect as many thoughts and as much
> information as possible in one place.
> (I'd also appreciate links to relevant threads.)
>
> My thoughts, specifically, were for the client to be fairly similar in
> function to apt-get (from the user's perspective). The core commands
> would be:
>
> INSTALL package_name -- queries a server (set in configuration files),
> gets an absolute link to the package and a list of dependencies.
> Downloads the package, and then any dependencies, and takes care of
> unpacking, verifying, compiling, storing, etc. - anything that needs to
> be done. The package would have to have metadata attached in some
> fashion.
>
> REMOVE package_name -- just has to remove all the relevant files and
> clean up any references to the package in the storage system. May want
> to check dependencies and alert the user.
>
> UPDATE [package_name] -- could be called to check all packages (by not
> specifying a package name) or to update a specific package. Warn the
> user of any dependencies that could be broken.
>
> My thought was that the package manager could itself be a package (but,
> hopefully, one that would be included by default). I wouldn't imagine
> that the current code base would need to be modified a whole lot. I was
> thinking that most of this project could be written in Guile. Does that
> make sense?
>
> Some thoughts I heard were:
> - use stowfs to handle storing the packages intelligently
> - use dorodango to handle most of the acquisition
>
> For the server design: I was envisioning a website to navigate packages
> (like http://packages.debian.org/stable/). My thought is to do
> everything over HTTP (seems like that would work well with dorodango).
> Doesn't seem like a hugely complicated issue?
>
> Questions about the server: How would the central repository be
> maintained? Give trusted developers the ability to make changes to
> their personal packages at will? How will packages be nominated /
> selected for inclusion?
>
> That's all I can think of right now. I'm sure there's more, but I want
> to start a discussion.
>
> Thanks,
> Paul R
Re: Guile virtual machine targets
Hello,

Let me clarify a bit about Lightning and the stuff I've been doing: Lightning is a very low-level assembler framework. It is embeddable, and Guile could certainly target it, but targeting Lightning would be just a speed enhancement - it wouldn't make us more compatible with anything else.

Actually, I had decided a while ago that we should look at using GNU libjit rather than Lightning, mostly because it does register allocation for us, whereas Lightning doesn't. It also provides tail call support, which Lightning does not (I think). Libjit is at about the same level of abstraction as LLVM, but with much less optimization ability. We talked about LLVM a bit, but Andy thought that we should avoid it because it is BSD-licensed rather than LGPL (if my memory is correct).

However, I don't think optimization is our primary concern right now, because before we start optimizing any sort of low-level representation, we would want better optimization for Tree-IL or a similar higher-level representation of Guile code.

Noah

On Mon, Apr 18, 2011 at 11:43 AM, Ewan Higgs <ewan_hi...@yahoo.co.uk> wrote:
> Hi all,
>
> With Noah's suggestion of moving the Guile VM in a direction to support
> CLisp work as well, I became curious about the status of the various
> Guile back ends (Lightning and Guile-vm). These questions are largely
> out of personal interest and not leading up to any particular piece of
> work I had in mind.
>
> Is Lightning actually supported? I saw that there is some mention of it
> on the old CVS page [1], but almost nothing else. I also see that Noah
> was working on something last year, but it doesn't appear to have
> gotten into the master repo [1].
>
> Is Lightning embeddable? If so, then I guess that work could replace
> the existing libguile virtual machine. CLisp has been using Lightning
> as of 2.45 (2008/02/24) [2]. So if Guile and CLisp were both using it,
> would Noah's interests be satisfied?
> If Lightning doesn't provide enough features to support the work that
> Guile's VM does, would it make sense to port the work from the Guile VM
> to Lightning instead and simply support that back end?
>
> If Lightning isn't embeddable, is LLVM? If so, wouldn't that become an
> attractive target for Guile, as it's well documented and has a lot of
> inertia?
>
> Thanks!
> -Ewan
>
> [1] As mentioned here: http://www.gnu.org/s/guile/anon-cvs.html --
> though CVS is no longer used. Noah has some work here:
> https://github.com/noahl/guile-lightning
> [2] http://www.clisp.org/impnotes.html
>
> NB: This was intentionally not cross-posted to the CLisp developers
> list.
Re: Just to mention: Why there's still not any guile-2.0 packages in linux distributions?
I don't know about the packaging, but I think the .go files depend on the endianness of the machine and nothing else. So you should be able to move them between different little-endian or big-endian machines (unless I'm wrong). But it seems much safer to me to just compile things. You shouldn't have to do it more than once unless you develop Guile.

Good luck,
Noah

On Mon, Apr 18, 2011 at 9:15 PM, CRLF0710 <crlf0...@gmail.com> wrote:
> Guile 2.0 has been out for some time. Why are there still no guile-2.0
> packages in Linux distributions? Did they have trouble packaging it?
>
> By the way, are the .go files portable? Building them (rnrs, ice-9,
> etc.) while compiling Guile takes so long...
>
> Thanks
> --
> CrLF.0710
Re: CPAN-type thing: specifications, wishes, thoughts?
Well, IIRC, it's not a feature that dorodango does not have ;-): dorodango's package container format (referred to as a bundle) can contain multiple packages. So it is possible to pack all your dependencies in a single ZIP, and install them in one go using that bundle. There's a bit of text about that in the dorodango manual, but here is how it works practically: % doro show-bundle my-app.zip # to view packages in the bundle % doro install --bundle=my-app-full.zip my-app # install my-app and all dependencies Ah, that's very cool! I actually meant support for installing the program without even having dorodango installed - using only what comes with it. But I imagine you could create some sort of bundle with the 'doro install' code plus the app bundle. Eventually you'd want cross-platform bundling support too, but that gets more difficult. I imagine that software like LilyPond or GnuCash would be interested in a convenient way to use new Guile modules. Support for this has to be present in the conception of the package/bundle format, so indeed it's important to take this feature into account from the beginning. I'm glad you've been thinking it through, then. Noah
Re: New feature proposal: Support C-code inline?
I just looked at the code. It's really cool! This looks like a way to write modules that use C and Scheme together in such a way that the code is all in one place. I think it could be much easier to read code written this way than code in separate C and Scheme files. What do other people think? Noah On Wed, Apr 20, 2011 at 5:50 AM, nalaginrut nalagin...@gmail.com wrote: hi all! I think somebody will need such a C-inline feature when he wants to write a smart C function to replace the critical function in his algorithm. And I found a similar feature in Ruby. So I wrote a module to provide this. Anyway, I think the best solution is an AOT compiler for Guile, but that seems a long way off. So this module provides an easy solution. Let me show its usage: (use-modules (inline inline)) (define in (make inline #:name "mytest")) ;; this name is arbitrary. But if you make more than one inline instance ;; in your program, each must have a unique name. Or shit happens. (inline:eat-code in "int func(int a, int b){ return a+b; }") ;; well~ this example is too stupid. Try another as you will... (let ((myfunc (inline:get in))) (myfunc 5 6)) ==> 11 Drawbacks: 1. Though this module will parse the function type list, it's not perfect. Users can only choose int/float/double/char/long/void. 2. Can't support unsigned in the function declaration. But you can use it in the function body. 3. Can't support pointers in the function declaration, which is a big drawback; the inline feature is restricted so much because of it. The good news is you can use them in the function body. 4. ...Anyone find more? Future: 1. Fix the known drawbacks. 2. Each inline instance will generate a shared lib file. That means each inline function will have one shared lib file. Though they won't conflict, because the lib file name is bound to the program name which uses it, I think it's not proper. Maybe we can add several inline functions to one inline instance, and generate just one shared lib for each program. 
I don't know how many people are interested in it. For some low-level programming guys, this module even provides inline assembly. I know this module is too simple for some guys, but it's easy to implement and easy to use. My patch is attached (actually it's not a patch...). Any advice? -- GNU Powered it GPL Protected it GOD Blessed it HFG - NalaGinrut --hacker key-- v4sw7CUSMhw6ln6pr8OSFck4ma9u8MLSOFw3WDXGm7g/l8Li6e7t4TNGSb8AGORTDLMen6g6RASZOGCHPa28s1MIr4p-x hackerkey.com ---end key---
Re: Indexing Scheme and C identifiers separately
Hello, I do not know how you are reading the Guile Reference Manual, but the printed version is about 809 pages long. At present, the indices run from page 755 to 809, so the revision that is suggested, above, would not be small. What would be of some help to get this project started is a list of the identifiers: 1) A list of all Scheme procedure names 2) A list of all C procedure names 3) A list of all Scheme variable names 4) A list of all C variable names 5) A list of all Scheme type names 6) A list of all C type names (By all names, I mean all names included in the Guile Reference Manual, not, for example, all C function names in Standard C.) Perhaps I am misunderstanding you, but I am afraid that you are not aware that the indices are generated automatically. The relevant file is guile-source/doc/ref/indices.texi. It contains three @printindex commands, which I suspect generate the indices. Therefore the way to do this is not to remake the indices ourselves, but to change the texinfo code that generates them for us. Unfortunately, I do not know enough about texinfo to know what is involved in this. So the good news is that it will be much less work than it might seem to re-do this. The bad news is that someone has to learn texinfo, unless one of us already knows it. Noah
Re: [PATCH 2/5] [mingw]: Have compiled-file-name produce valid names.
Is anyone interested in implementing a path library? Andy I might be able to work on it. I haven't done much for Guile lately, but I expect to have a lot more free time once my semester ends on May 7th. However, I don't know much about how Windows paths work. Are there any special considerations beyond the directory separator? Also, are there any characters that are valid in filenames on some systems but invalid on other systems? Noah
Re: [PATCH 2/5] [mingw]: Have compiled-file-name produce valid names.
Yep! Check that racket web page I linked to. You don't have to implement all of it, but it should be possible to implement, given the path abstraction. Okay, I've read it. It doesn't seem very complicated. Should we strive for API compatibility? I don't see any programs needing it right now, but maybe there would be in the future if we made them compatible. Ah, I see you are under the delusion that paths are composed of characters :) This is not the case. To the OS, paths are NUL-terminated byte arrays, with some constraints about their composition, but which are not necessarily representable as strings. It is nice to offer the ability to convert to and from strings, when that is possible, but we must not assume that it is indeed possible. Thanks! However, I'm also under a somewhat different delusion, which the Racket docs disagree with. I think of a path as a vector of path elements, each of which represents a directory except that the last one might represent a file. I notice the Racket path library makes their path object distinct from this - you can build a path from a list of path elements with build-path, and turn a path into a list of path elements with explode-path, but you can't take an actual path object and manipulate its components (unless I've missed something). Do you think this is the right way to think of it? I'd say that my way of thinking makes more sense if you think that a filesystem is really just a directed acyclic graph (well, usually acyclic), and a path is a list of graph nodes. I can't quite see what the alternative model is, but I have a feeling there's another way of thinking where Racket's path library makes more sense. Basically I think the plan should be to add scm_from_locale_path, scm_from_raw_path, etc to filesys.[ch], and change any pathname-accepting procedure in Guile to accept path objects, producing them from strings when given strings, and pass the bytevector representation to the raw o/s procedures like `open' et al. 
Then for a lot of the utilities, we can add (ice-9 paths) or something, and implement most of the utility functions in Scheme. Sounds like a plan. Noah
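For comparison, Python's pathlib treats a pure path as exactly the vector of components Noah describes, and it does let you manipulate the components directly; this is offered only as a reference point, not a proposal for Guile's API:

```python
from pathlib import PurePosixPath

# A path is conceptually a sequence of components; no separator
# characters survive inside the components themselves.
p = PurePosixPath("/home/noah/Desktop/Getting a Job")
print(p.parts)  # ('/', 'home', 'noah', 'Desktop', 'Getting a Job')

# The build-path / explode-path round trip:
rebuilt = PurePosixPath(*p.parts)
assert rebuilt == p

# Manipulating individual components, which Racket's opaque paths
# do not expose directly:
renamed = PurePosixPath(*p.parts[:-1], "Got a Job")
print(renamed)  # /home/noah/Desktop/Got a Job
```

Note that pathlib's *pure* paths are deliberately string-based and separate from the byte-level OS interface, which mirrors the group split discussed later in this thread.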
Re: [PATCH 2/5] [mingw]: Have compiled-file-name produce valid names.
Hello all, I have another issue to raise. I think this is actually parallel to some of the stuff in the (web) module, as you will see. I've always thought it was ridiculous and hackish that I had to escape spaces in path strings. For instance, I have a folder called "Getting a Job" on my desktop, whose path is ~/Desktop/Getting\ a\ Job. The reason this strangeness enters is that path strings are actually lists (or vectors) encoded as strings. Conceptually, the path ~/Desktop/Getting\ a\ Job is the list (~ Desktop "Getting a Job"). In this representation, there are no escapes and no separators. It always seemed cleaner to me to think about it that way. I think there should be some mechanism by which Guile users never have to think about escaping spaces (and any other characters they want in their paths). We don't have to represent them with lists or vectors, but there should be some mechanism for avoiding this. I said this is similar to the (web) module because of all of the discussion there of how HTTP encodes data types in text, and how it's better to think of a URI as a URI type rather than a special string, etc. I think the same issue applies here - you've got a list (or a list of lists, if you have a whole command-line with arguments) encoded as a string using ' ' and '/' as separators, and then you have to escape those characters when you want to use them in a different way, and the whole thing gets unnecessarily complicated because the right way to think about this is as lists of strings. Noah On Tue, May 3, 2011 at 11:59 PM, Mark H Weaver m...@netris.org wrote: Andy Wingo wi...@pobox.com writes: That's the crazy thing: file names on GNU aren't in any encoding! They are byte strings that may or may not decode to a string, given some encoding. Granted, they're mostly UTF-8 these days, but users have the darndest files... [...] 
On Tue 03 May 2011 00:18, l...@gnu.org (Ludovic Courtès) writes: I think GLib and the like expect UTF-8 as the file name encoding and complain otherwise, so UTF-8 might be a better default than locale encoding (and it’s certainly wiser to be locale-independent.) It's more complicated than that. Here's the old interface that they used, which attempted to treat paths as utf-8: http://developer.gnome.org/glib/unstable/glib-Character-Set-Conversion.html (search for file name encoding) The new API is abstract, so it allows operations like get-display-name and get-bytes: http://developer.gnome.org/gio/2.28/GFile.html (search for encoding in that page) All GFiles have a basename (get with g_file_get_basename()). These names are byte strings that are used to identify the file on the filesystem (relative to its parent directory) and there is no guarantees that they have any particular charset encoding or even make any sense at all. If you want to use filenames in a user interface you should use the display name that you can get by requesting the G_FILE_ATTRIBUTE_STANDARD_DISPLAY_NAME attribute with g_file_query_info(). This is guaranteed to be in utf8 and can be used in a user interface. But always store the real basename or the GFile to use to actually access the file, because there is no way to go from a display name to the actual name. In my opinion, this is a bad approach to take in Guile. When developers are careful to robustly handle filenames with invalid encoding, it will lead to overly complex code. More often, when developers write more straightforward code, it will lead to code that works most of the time but fails badly when confronted with weird filenames. This is the same type of problem that plagues Bourne shell scripts. Let's please not go down that road. There is a better way. We can do a variant of what Python 3 does, described in PEP 383 http://www.python.org/dev/peps/pep-0383/. 
Basically, the idea is to provide alternative versions of scm_{to,from}_stringn that allow arbitrary bytevectors to be turned into strings and back again without any lossage. These alternative versions would be used for operations involving filenames et al, and should probably also be made available to users. Basically the idea is that invalid bytes are mapped to code points that will never appear in any valid encoding. PEP 383 maps such bytes to a range of surrogate code points that are reserved for use in UTF-16 surrogate pairs, and are otherwise considered invalid by Unicode. There are other possible mapping schemes as well. See section 3.7 of Unicode Technical Report #36 http://www.unicode.org/reports/tr36/ for more discussion on this. I can understand why some say that filenames in GNU are not really strings but rather bytevectors. I respectfully disagree. Filenames, environment variables, command-line arguments, etc, are _conceptually_ strings. Let's not muddle that concept just because the transition to Unicode has not yet been completed in the GNU world. Hopefully in the
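The scheme Mark describes is what Python 3 ships as the 'surrogateescape' error handler; a short demonstration of the lossless round-trip property:

```python
# PEP 383 in action: each undecodable byte is mapped to a lone surrogate
# code point (U+DC80..U+DCFF), which never occurs in valid Unicode text,
# so arbitrary byte strings survive a trip through str and back.
raw = b"caf\xe9"  # Latin-1 bytes; invalid as UTF-8
name = raw.decode("utf-8", "surrogateescape")
assert name == "caf\udce9"

# Encoding with the same handler restores the original bytes exactly:
assert name.encode("utf-8", "surrogateescape") == raw

# A valid UTF-8 name passes through the same code path unchanged:
assert b"caf\xc3\xa9".decode("utf-8", "surrogateescape") == "caf\u00e9"
```

The cost, as the thread notes, is that such strings are not always printable or re-encodable under strict handlers, so code that does string surgery on filenames still has to be a little careful.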
Re: early termination for `map'
That makes sense. On Thu, May 5, 2011 at 11:24 AM, Andy Wingo wi...@pobox.com wrote: Hello, If you call `map' or `for-each' with more than one list, our versions of these operators will detect if the lists are of unequal length, and throw an error in that case. However, SRFI-1 has long provided an extension to this, to allow for early termination when any of the lists runs out. R6RS adopted this, and it looks like R7RS will ratify that. So perhaps it's time for us to change as well. This would also allow us to get rid of the hack in srfi-1.c in which, when and if GOOPS gets loaded, srfi-1 extends the `map' and `for-each' primitive generics with its own early-termination code, which in effect gives early termination to every `map' user, regardless of whether that module has imported srfi-1 or goops. Sometimes I think that Mikael put the Oops in Goops for a reason ;-) Andy -- http://wingolog.org/
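The semantics being adopted here are the ones most languages with a variadic map already have; Python's built-in `map`, for instance, simply terminates when the shortest argument runs out:

```python
# SRFI-1-style early termination: map over several sequences stops as
# soon as any one of them is exhausted, rather than raising an error.
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20]))
print(sums)  # [11, 22] -- the trailing 3 is simply ignored
```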
Re: Some guile-unify activities
Cool! On Fri, May 6, 2011 at 5:18 PM, Stefan Israelsson Tampe stefan.ita...@gmail.com wrote: Hi, Just wanted to chime in and tell you a little about what I'm doing with the guile-unify package. First off, this is a tool to do backtracking effectively, e.g. tree searches, and make heavy use of dynamic variables. There are three activities ongoing: 1. type-checking examples 2. first order logic solvers (porting leanCop) 3. documentation 1. Documentation: I needed to recap some things and started to improve on the examples above in order to achieve higher generality. As it turned out, bugs in the underlying code are squashed all the time and new useful features for the macro package are discovered. So the codebase is still in a pretty fluid state. On the other hand, some of the basic features are being documented. 2. The core part of the first order logic leanCop solver has been translated to a macro package and is now in a pure Scheme form. It's not that advanced. Then there is a language for how to state the problem, especially a translational tool to gather information from a database of first order logic problems. The idea now is to mod that tool into spitting out a file of the translations that work, then call guile and let guile read in that file and process it with the core solver. The idea is to test out a few alternative solving techniques and evaluate their performance on the database. 3. For the type-checking examples I've been working with the assumption of having fixed declarations for lambdas and deducing types for variables and their passage through the system. The thought is that in the end one needs to allow lambdas to be explicitly typed in order to allow advanced type-checking and at the same time avoid infinite recursions. I think it's not uncommon in Haskell programming to add type information in order to guide the type checker, and at such an infant stage, having something useful and working must simply allow for type hints in the code. 
It would be cool to let emacs interact with guile to give a tool to automagically construct type signatures, e.g. (lambda (A) B) <enter> --> (lambda (integer) integer), i.e. to be able to interactively deduce a type signature. I will try to add more type primitives like recursive types and so on to be able to match Haskell and Typed Racket. Generally the biggest problem with backtracking is complexity issues. One can design relatively small programs that make the type checker work for ages. I will not address this right now, but there are plenty of shortcuts to be taken in order to speed up the checker. It would be good if there was a standard way to enter type information in guile and if that information could be hooked into the tree-il representation. But until then I will just use a simple macro framework to enter typed functions. To note, in the code below: if a lambda produces (or a b), then the rest of the code has to be tried with a; when that checks, backtrack and try with b. If both type-check, then we can proceed. (This is the naive and simple implementation.) So we can have a general type checker, but complexity-wise it will be awful. --- So now this works in 1 sec. ** Code (has a geometric explosion in complexity, e.g. works like 1000 lines of code) **

(tdefine (f X)
  (type (lambda (boolean) (or integer symbol))  ;; type signature is (type type-desc code ...)
                                                ;; and gives a type hint to the lambda
    (let ((F (lambda (P)
               (type (lambda (boolean) (or integer symbol))
                 (if P 1 's)))))
      (F #t) (F #t) (F #t) (F #t) (F #t)
      (F #t) (F #t) (F #t) (F #t) (F #f))))

** first stage compiled code ***

(check
 (lambda ((#{X\ 10764}# : boolean)) : (or integer symbol)
   (begin
     (let ((#{F\ 10770}#
            (lambda ((#{P\ 10771}# : boolean)) : (or integer symbol)
              (begin (if #{P\ 10771}# 1 (quote s))))))
       (begin
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#t))
         (apply #{F\ 10770}# (#f))))))) -- Deduced Equation
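The or-type backtracking Stefan outlines (try alternative a, and if the rest of the code fails to check, retry with b) can be sketched in a few lines of Python; this conveys only the flavor of the idea, not guile-unify's actual machinery:

```python
# A toy checker for union types like (or integer symbol): for a union,
# try each alternative in turn and backtrack to the next on failure.
def check(value, ty):
    if isinstance(ty, tuple) and ty[0] == "or":
        # naive backtracking: succeed if any alternative type-checks
        return any(check(value, alt) for alt in ty[1:])
    return isinstance(value, ty)

int_or_str = ("or", int, str)
assert check(1, int_or_str)
assert check("s", int_or_str)
assert not check(1.5, int_or_str)
```

The combinatorial blow-up he mentions is visible even here: with n union-typed values flowing through a program, a naive checker may explore up to 2^n alternative assignments.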
Re: [PATCH 2/5] [mingw]: Have compiled-file-name produce valid names.
Hello all, I've been scanning some file API documentation and wondering what we could do that would translate across platforms reliably. I've been thinking of sort of concentric circles of operations, where the inner circles can easily be supported in a cross-platform way, and the outer ones require more and more hackery. What do you think of the following? Group 1: Treat pathnames as opaque objects that come from outside APIs and can only be used by passing them to APIs. We can support these in a way that will be compatible everywhere. Operations: open file, close file, stat file. In order to be useful, we might also provide a command-line-argument->file operation, but probably no reverse operation. Group 2: treat pathnames as vectors of opaque path components. Operations: list items in a directory. Group 3: now we need to care about encoding. Operations: string->path, path->string. This will be much harder than groups 1 and 2. I think group 1 by itself would allow for most command-line programs that people want to write. If you add group 2, you could write find, ls, cat, and probably others. You need group 3 to write grep and a web server. My thought right now is that group 3 is going to have a complex API if we really want to get encodings right. Our goal should be that this complexity doesn't affect group 1 and group 2, which really should have very simple APIs. Now, some thoughts on group 3: Mark is right that paths are basically just strings, even though occasionally they're not. I sort of like the idea of the PEP-383 encoding (making paths strings that can potentially contain unused codepoints, which represent non-character bytes), but would that make path strings break under some Guile string operations? Also, when we convert strings to paths, we need to know what encoding the local filesystem uses. That will usually be UTF-8, but potentially might not be, correct? 
If we can auto-discover the correct encoding, we might be able to keep all of that in the background and just pretend that we can convert Guile strings to file system paths in a clean way. Noah On Wed, May 4, 2011 at 5:24 AM, Ludovic Courtès l...@gnu.org wrote: Hi Noah, Noah Lavine noah.b.lav...@gmail.com writes: The reason this strangeness enters is that path strings are actually lists (or vectors) encoded as strings. Conceptually, the path ~/Desktop/Getting\ a\ Job is the list (~ Desktop Getting a Job). In this representation, there are no escapes and no separators. It always seemed cleaner to me to think about it that way. Agreed. However, POSIX procedures deal with strings, so you still need to convert to a string at some point. So I think there are few places where you could really use anything other than strings to represent file names—unless all of libguile is changed to deal with that, which seems unreasonable to me. MIT Scheme’s API goes this route, but that’s heavyweight and can hardly be retrofitted in a file-name-as-strings implementation, I think: http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/Pathnames.html. I said this is similar to the (web) module because of all of the discussion there of how HTTP encodes data types in text, and how it's better to think of a URI as URI type rather than a special string, etc. Yes. Thanks, Ludo’.
ELisp Implementation?
Hello, Guile's Summer of Code project this summer was an implementation of Emacs Lisp. I am curious - what is happening with that? Is it progressing? I follow the list, but haven't heard anything since the decision on which project we wanted. Noah
Re: Patch to add (define-syntax (foo bar) ...) support
Hello, I agree that this is much shorter, but I'm worried about defining the short syntax in a way that forces you to choose between syntax-rules and syntax-case. What I mean is that you could just as easily have (define-syntax (foo bar) ...) expand to (define-syntax foo (syntax-rules () ((_ bar) ...))) It seems to me that this makes a somewhat arbitrary choice, which isn't great. I'd rather see some way to unify the two possibilities, but I don't know what that would be. There's also the possibility of making it expand to (define-syntax foo (syntax-case tmp ... ((bar) ...))) because it is more analogous to how regular procedures work. I don't know what the right choice is, but it's a good point that there probably should be something. Noah On Sun, Jul 3, 2011 at 4:19 PM, Chris K. Jester-Young cky...@gmail.com wrote: Hi there, When writing syntax-case macros, often one would write: (define-syntax foo (lambda (bar) (syntax-case bar ...))) This seems overly long-winded; it would be preferable to be able to write, instead: (define-syntax (foo bar) (syntax-case bar ...)) Attached is a patch that implements that. Note that there is nothing original in this patch---it's just a straight copy-and-paste of the define version immediately above, except changing define-form to define-syntax-form---so there should be nothing controversial from a correctness and/or copyright point of view. Let me know what you think. Many thanks, Chris.
Re: Patch to add (define-syntax (foo bar) ...) support
Except, it doesn't. My version doesn't insert either syntax-case or syntax-rules; it just inserts the lambda and lets you do whatever. Oh, I must have been temporarily insane. My apologies. :-) Your idea makes a lot of sense. Noah
Build Bug in wip-peg
Hello all, It's been a while, but I just got the wip-peg branch from the main repository and tried to build it. After doing "make clean; make", I get this error:

Undefined symbols:
  _rpl_open, referenced from:
      _scm_open_file in libguile_2.0_la-fports.o
      _scm_load_objcode in libguile_2.0_la-objcodes.o
      _scm_copy_file in filesys.o
      _scm_copy_file in filesys.o
      _scm_open_fdes in filesys.o
ld: symbol(s) not found

However, I do not get an error when I build the master branch. Since wip-peg is based on the stable branch, I assume that some build change between stable and master made it work on my system. Can anyone suggest what that might be? Noah
Re: Build Bug in wip-peg
Oh yes, good idea. I didn't realize I had hit reply instead of reply-all. I'm replying to the list now so at least it will get a copy of the emails. :-) On Tue, Sep 6, 2011 at 4:41 AM, Andy Wingo wi...@pobox.com wrote: Hi Noah, Let's try to keep the list in the loop on future mails. My inbox is a bit lossy :) Cheers, Andy On Tue 06 Sep 2011 04:25, Noah Lavine noah.b.lav...@gmail.com writes: I have fixed this. There is currently a branch on Savannah called wip-peg-fixed which has all of the peg changes rebased on top of stable-2.0. It builds fine on my machine and runs the peg test suite with all passes. (I couldn't overwrite wip-peg without pulling and merging, and I was afraid I would mess up my changes, so I used a different name.) When I was trying to fix the compilation bug, I realized that for some reason wip-peg did not have the most recent version of the peg code. I don't know why that would happen, so I decided the safest thing to do was take my local branch of up-to-date peg code and rebase on top of stable-2.0, which is what I did. It now works. The next step, I think, is fixing the S-expression representation of some parts of the grammar. Then, finally, I think it might be ready to use. :-) Noah On Sun, Sep 4, 2011 at 10:50 AM, Noah Lavine noah.b.lav...@gmail.com wrote: No, that eliminates the problem. There's still a bug in compiling the peg code, but I have a fix for that too. Currently there is something odd going on with the peg test suite, which I will look at, and then we can finally fix the S-expression representation of the PEG grammars :-). Noah On Sun, Sep 4, 2011 at 6:14 AM, Andy Wingo wi...@pobox.com wrote: On Sat 03 Sep 2011 22:38, Noah Lavine noah.b.lav...@gmail.com writes: It's been a while, but I just got the wip-peg branch from the main repository and tried to build it. 
After doing make clean make, I get this error: Undefined symbols: _rpl_open, referenced from: _scm_open_file in libguile_2.0_la-fports.o _scm_load_objcode in libguile_2.0_la-objcodes.o _scm_copy_file in filesys.o _scm_copy_file in filesys.o _scm_open_fdes in filesys.o ld: symbol(s) not found However, I do not get an error when I build the master branch. Since wip-peg is based on the stable branch, I assume that some build change between stable and master made it work on my system. Can anyone suggest what that might be? wip-peg is based on stable-2.0, not master. Perhaps it was something fixed later in stable-2.0. Does the problem persist if you git merge origin/stable-2.0? If not, then we should probably rebase the branch again. It will be great when we finally merge this branch :-) Andy -- http://wingolog.org/ -- http://wingolog.org/
Re: The wonders of partial evaluation
This is excellent! Congratulations! An interesting note from my current project: a partial evaluator means you don't have to use macros to define the PEG parser (while keeping exactly the same efficiency as now): instead of having (define-peg pattern) be a macro that analyzes pattern and outputs code, you have (interpret-peg pattern string) be a program that parses string with peg. Then to generate code, you just partially apply interpret-peg to pattern, leaving string undetermined. (I'm not sure if the partial evaluator currently does this, but it would be an interesting approach.) Noah On Thu, Sep 8, 2011 at 6:32 PM, Ludovic Courtès l...@gnu.org wrote: Hello! Three days ago I had the chance to attend William Cook’s excellent tutorial on partial evaluation at DSL 2011, called “Build your own partial evaluator in 90 minutes” [0]. A few times 90 minutes later, I am pleased to announce a new partial evaluator for Tree-IL: http://git.sv.gnu.org/cgit/guile.git/commit/?h=stable-2.0&id=11671bbacbdd52039c77978bfe7f24a3316f6019 Partial evaluation is “the mother of all optimizations”, and in particular that of constant propagation and inlining. So that’s what the partial evaluator is here for. A few examples, showing code before and after partial evaluation: (let ((x 1) (y 2)) (+ x y)) => (const 3) (letrec ((even? (lambda (x) (or (= 0 x) (odd? (- x 1))))) (odd? (lambda (x) (not (even? (- x 1)))))) (and (even? 4) (odd? 7))) => (const #t) (letrec ((fibo (lambda (n) (if (<= n 1) n (+ (fibo (- n 1)) (fibo (- n 2))))))) (fibo 7)) => (const 13) Here all the values are known at compile-time, so the partial evaluator does basically the same job as ‘eval’. 
(let ((f (lambda (x y) (if (> x 0) y x)))) (+ (f -1 x) (f 2 y) (f z y))) => (let (f) (_) ((lambda (_) (lambda-case (((x y) #f #f #f () (_ _)) (if (apply (primitive >) (lexical x _) (const 0)) (lexical y _) (lexical x _)))))) (apply (primitive +) (const -1) ; (f -1 x) (toplevel y) ; (f 2 y) (apply (lexical f _) ; (f z y) (toplevel z) (toplevel y)))) Here calls to ‘f’ have been inlined 3 times and specialized twice. Isn’t it great? :-) Note that currently this only works with lexical bindings, not across top-level bindings–i.e., a top-level binding never gets inlined. Of course the main difficulty is to make sure it behaves correctly in the presence of effects. Please let me know if you suspect a compilation problem (partial evaluation can be turned off, see ‘optimize.scm’.) Feedback welcome! Ludo’. [0] http://www.cs.utexas.edu/~wcook/tutorial/
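The constant-propagation half of what the partial evaluator does can be shown in a few lines of Python over a toy expression tree; this is purely illustrative (Tree-IL is of course far richer, and real partial evaluation also handles inlining, effects, and residual bindings):

```python
# Constant folding, the simplest kind of partial evaluation: reduce any
# subexpression whose operands are all known at "compile time", and
# leave a residual expression where a free variable blocks evaluation.
def fold(expr):
    if isinstance(expr, tuple):
        op, a, b = expr
        a, b = fold(a), fold(b)
        if isinstance(a, int) and isinstance(b, int):
            return {"+": a + b, "*": a * b}[op]
        return (op, a, b)  # residualize: not fully known
    return expr  # a constant or a free variable name

assert fold(("+", 1, 2)) == 3                          # like (+ 1 2) => (const 3)
assert fold(("+", ("*", 2, 3), "x")) == ("+", 6, "x")  # residual program
```

This also makes Noah's point concrete: partially evaluating an interpreter applied to a known first argument (the pattern) and an unknown second argument (the string) leaves exactly the residual program a compiler would have emitted.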
More PEG
Hello all, It looks to me like the last thing needed before the peg branch can be used is to change some of the S-expression representations of the components. Here are the five that I think need changing, taken from the manual, with suggested replacements. -- PEG Pattern: zero or more a Parses A as many times in a row as it can, starting each A at the end of the text parsed by the previous A. Always succeeds. `a*' `(body lit a *)' change to: (* a) -- PEG Pattern: one or more a Parses A as many times in a row as it can, starting each A at the end of the text parsed by the previous A. Succeeds if at least one A was parsed. `a+' `(body lit a +)' change to: (+ a) -- PEG Pattern: optional a Tries to parse A. Succeeds if A succeeds. `a?' `(? a)' Old: `(body lit a ?)' change to: (? a) -- PEG Pattern: and predicate a Makes sure it is possible to parse A, but does not actually parse it. Succeeds if A would succeed. `&a' `(body & a 1)' change to: `(followed-by a)' -- PEG Pattern: not predicate a Makes sure it is impossible to parse A, but does not actually parse it. Succeeds if A would fail. `!a' `(body ! a 1)' change to: `(not-followed-by a)' The first three were chosen to match the string representation, and because *, + and ? are standard syntaxes for those ideas. The last two were changed to words because I think & and ! aren't standard for the ideas of followed-by and not-followed-by, and it would be hard to remember which S-expression corresponded to which meaning, especially since there is an S-expression syntax 'and' which is completely different from '&'. There's something I'm still a little unsure about, though. It's possible to get deeply nested S-expressions, like '(+ (and (* a) b)). Since +, * and ? only ever take one argument, it is possible to shorten such a list by letting people merge those elements into the next item, like this: '(+ and (* a) b). 
On the other hand, that could get confusing if you later try to extend peg syntax to match not just strings, but arbitrary sequences (which I have been thinking of). For instance, say you want to match at least one copy of the literal list '(a). Is it (+ '(a)) or (+ quote a)? What do you all think? Noah
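To make the proposed S-expression syntax concrete, here is a miniature matcher over exactly these five forms, written in Python for illustration (a toy interpreter only; the real module compiles patterns to Scheme code rather than interpreting them):

```python
# match(pat, s, i) returns the end index of the match on success, or
# None on failure. PEG repetition is greedy and, unlike regexps, never
# backtracks into text it has already consumed.
def match(pat, s, i=0):
    if isinstance(pat, str):  # in a PEG, strings match themselves
        return i + len(pat) if s.startswith(pat, i) else None
    op, *args = pat
    if op == "*":  # zero or more: always succeeds
        j = match(args[0], s, i)
        while j is not None:
            i, j = j, match(args[0], s, j)
        return i
    if op == "+":  # one or more: at least one copy required
        j = match(args[0], s, i)
        return None if j is None else match(("*", args[0]), s, j)
    if op == "?":  # optional: succeeds either way
        j = match(args[0], s, i)
        return i if j is None else j
    if op == "followed-by":  # and-predicate: check but consume nothing
        return i if match(args[0], s, i) is not None else None
    if op == "not-followed-by":  # not-predicate: consume nothing
        return i if match(args[0], s, i) is None else None
    raise ValueError("unknown PEG operator: %r" % (op,))

assert match(("*", "ab"), "ababc") == 4
assert match(("+", "a"), "b") is None
assert match(("?", "a"), "b") == 0
assert match(("followed-by", "ab"), "abc") == 0
assert match(("not-followed-by", "a"), "b") == 0
```

Written out this way, the predicates' word names read naturally at the call site, which supports the argument for `followed-by' and `not-followed-by' over `&' and `!'.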
Re: More PEG
Hello, The syntactic changes you propose all make sense to me, FWIW. Great! Then I will push an implementation soon unless someone else objects. A more general question about PEG: how do you bind a variable to the result of a pattern? For instance, if you want the result of (* a) to be bound to variable X–just like (match '(1 2 3) ((a _ ...) a)) binds A to 1. We currently don't have that ability. I can see that it would be very convenient, though, so maybe we should have it. Right now I think you could get the same thing by running match on the output of the peg - something like (match (peg:tree (peg-parse pattern tree)) ((list-of-as) ...)). Do you think that's a good enough interface, or should the binding happen directly? I can see that it would be more efficient if we didn't box up the results in a structure first. Thanks, Noah
Re: More PEG
Hello, Can you give an example of what ‘peg-parse’ and ‘peg:tree’ return? 'peg-parse' returns a match record, which contains the string that was parsed, the substring that actually matched, and the tree structure produced by the matching. 'peg:tree' gets the tree structure out of a match record. I hadn't really looked at this part of the module, but it looks to me now like the names aren't descriptive enough. Maybe we should change them before putting this in a release. Noah On Sat, Sep 10, 2011 at 5:35 PM, Ludovic Courtès l...@gnu.org wrote: Hi! Noah Lavine noah.b.lav...@gmail.com skribis: Right now I think you could get the same thing by running match on the output of the peg - something like (match (peg:tree (peg-parse pattern tree)) ((list-of-as) ...)). Can you give an example of what ‘peg-parse’ and ‘peg:tree’ return? Ludo’. PS: I would suggest removing the ‘peg-’ and ‘peg:’ prefixes. Users can still use a renamer if the names clash.
Re: More PEG
As promised, the 'wip-peg-fixed' branch now has an updated S-expression representation for PEG grammars, and updated documentation for it. As I understand it, the next thing to do before merging (and possibly the last thing!) is to think about what things should be named, so they can be easy for people to learn and forwards-compatible with whatever our future plans for parsing are. I will try to send an email soon with some thoughts on that. Noah On Sat, Sep 17, 2011 at 9:09 PM, Noah Lavine noah.b.lav...@gmail.com wrote: Hello, Can you give an example of what ‘peg-parse’ and ‘peg:tree’ return? 'peg-parse' returns a match record, which contains the string that was parsed, the substring that actually matched, and the tree structure produced by the matching. 'peg:tree' gets the tree structure out of a match record. I hadn't really looked at this part of the module, but it looks to me now like the names aren't descriptive enough. Maybe we should change them before putting this in a release. Noah On Sat, Sep 10, 2011 at 5:35 PM, Ludovic Courtès l...@gnu.org wrote: Hi! Noah Lavine noah.b.lav...@gmail.com skribis: Right now I think you could get the same thing by running match on the output of the peg - something like (match (peg:tree (peg-parse pattern tree)) ((list-of-as) ...)). Can you give an example of what ‘peg-parse’ and ‘peg:tree’ return? Ludo’. PS: I would suggest removing the ‘peg-’ and ‘peg:’ prefixes. Users can still use a renamer if the names clash.
Defining Functions With Syntax
Hello all, I was just going through updating the PEG documentation when I encountered a very weird situation, and I don't know how to solve it. As you probably know, the PEG system can take an s-expression representation of a PEG grammar (a peg s-exp) and turn it into a function that can parse strings of this grammar. A macro called 'define-nonterm' does this at compile-time, but what if you build up a PEG grammar at runtime and want to parse something with it? The PEG documentation recommends doing '(eval `(define-nonterm as body ,exp) (interaction-environment))', where 'exp' is whatever your s-expression is. This seems like a bad idea to me, because a) it also defines a variable 'as', which is not needed and maybe not what you want and b) it seems unnecessarily complicated. This recommendation is especially strange because part of the PEG API is the function 'peg-sexp-compile', which actually turns a peg s-exp into a parsing function. Unfortunately, peg-sexp-compile returns syntax, because that is what is convenient for define-nonterm. So at long last, my question is this: is there a way to take some syntax and turn it directly into a function, without putting it inside a macro and calling 'eval'? I think there should be, but I don't see it in the manual. If there is not, I'd like to figure out an API and add it. Thanks, Noah
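For reference, the eval-based recipe being criticized looks roughly like this (a sketch; exp stands for whatever PEG s-expression you built at run time):

```scheme
;; Build a grammar at run time...
(define exp '(+ "a"))

;; ...then push it through the compile-time macro with eval.
;; Note the side effect: this also defines a variable 'as',
;; whether or not you wanted one.
(eval `(define-nonterm as body ,exp)
      (interaction-environment))
```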
Re: Defining Functions With Syntax
Hello, It is complicated though. The other option is to write a PEG interpreter or compiler in Scheme. An interpreter would interpret the s-expressions directly, or some simplified form of them. A compiler would pre-process the s-expressions to produce a procedure, built from closures. Yes indeed. The thing is, we already have a compiler written in Scheme. The trouble is that it compiles S-expressions to syntax, and then it takes some trouble to get that syntax into a procedure. I think it might be cleaner if I could directly convert syntax into a procedure. For instance, what about doing (compile (peg-sexp-compile some-s-expression) #:from scheme-syntax #:to value) ? That seems like a nice API to me. Noah
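The call being proposed would look something like this (hypothetical: 'scheme-syntax' is not an existing #:from language at this point in the thread, and peg-sexp-compile's argument list is simplified for illustration):

```scheme
;; Compile a run-time PEG s-expression straight to a procedure,
;; with no macro or eval in between.
(define parser
  (compile (peg-sexp-compile '(+ "a"))
           #:from scheme-syntax    ; hypothetical language tag
           #:to value))
```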
Re: Defining Functions With Syntax
Hello, If the s-expression is a compile-time constant, and if the PEG compiler is written in purely-functional style and confined within a single top-level form, then peval should be able to produce the parser procedure at compile-time. Otherwise, it would be produced at run-time. This is part of the reason partial evaluation is so exciting. It's not just about increasing the speed of existing programs. More importantly, it allows you to write new programs more elegantly. I agree, and in fact I sent an email a few days ago saying the same thing. :-) I suspect that any macro that doesn't introduce variable bindings could be written that way, which would be excellent. It might also be much easier to write, because there are better debugging facilities for interpreters than macros, like the trace command and the debugger. I hope peval gets to that point soon. Since you've brought it up, I've been thinking of a feature I would want before using that style. peval by default will have to restrict what it inlines to save space. I'd like a way to guarantee that the things I want inlined would be inlined, by calling peval myself. So the best way to do PEG might be something like (define (peg-sexp-compile sexp) (peval (lambda (string) (peg-sexp-interpret sexp string)) #:full)) or something along those lines, where #:full indicates that all of peg-sexp-interpret should be inlined. (Although in real life you'd want a much better way to indicate what should be inlined, and I don't know what it is yet. I haven't thought through the interface at all.) That is a long term idea for peval, though. It looks like there are many more urgent things to do. Noah
Re: Defining Functions With Syntax
For instance, what about doing (compile (peg-sexp-compile some-s-expression) #:from scheme-syntax #:to value) ? That seems like a nice API to me. You could do the same with #:from 'scheme, no? I don't think so, because I think #:from 'scheme expects an S-expression, but peg-sexp-compile returns a syntax object. I might be wrong, though. Noah
Re: Defining Functions With Syntax
Oh, excellent! I had assumed it did not. I will update the PEG documentation. On Wed, Sep 21, 2011 at 3:22 PM, Andy Wingo wi...@pobox.com wrote: On Wed 21 Sep 2011 20:44, Noah Lavine noah.b.lav...@gmail.com writes: I think #:from 'scheme expects an S-expression, but peg-sexp-compile returns a syntax object. I might be wrong, though. (compile #'1) => 1 Compiling a syntax object works. Andy -- http://wingolog.org/
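Given Andy's answer, the eval-free recipe reduces to handing the syntax object to compile with its defaults (a sketch; peg-sexp-compile's argument list is simplified for illustration):

```scheme
;; compile accepts a syntax object directly, with the default
;; #:from 'scheme:
(compile #'(+ 1 2))    ; => 3

;; so a PEG s-expression can become a parser procedure like this:
(define parser
  (compile (peg-sexp-compile '(+ "a"))))
```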
Names for PEG Functions
Hello all,

As the PEG module nears some reasonable level of completion, we should figure out what all of the functions need to be named so everyone can reasonably understand them. At first I thought the names should be consistent with the LALR and regexp modules, so all of the parsing modules would agree. But after looking at those two, I realized that they aren't consistent with each other, so it would be very difficult to make PEG match them. At least, however, PEG can be consistent with itself. Therefore I suggest the following names and meanings:

define-peg-sexp - define a nonterminal from an s-expression
define-peg-string - define a set of nonterminals from a string
compile-peg-sexp - compile an sexp to a nonterminal (an opaque value to the user, but really just a function)
compile-peg-string - compile a string to a nonterminal
match-peg - match a peg to a string, starting at the beginning
search-peg - match a peg to a string, starting at each index in turn until we find a match or reach the end

I realize that putting 'peg' in the names isn't really necessary because the user could use a module renamer, as Ludovic pointed out a few days ago. I put 'peg' in the define-* syntax because I thought 'define-sexp' and 'define-string' were too general as names, and then I wanted the compile-* functions to be consistent with them. As for the others, 'match' and 'search' seemed too general. The verb comes first in all of these because I think putting 'define' at the beginning of syntax that defines things is standard, and then I made the other names match that one. If anyone has better ideas, though, I'm not attached to these names.

Looking forward to merging this,
Noah
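Under the proposed names, a session might look like this (purely illustrative; none of these bindings exist yet under these names, and the argument conventions are guessed from the rest of the thread):

```scheme
;; Define nonterminals from a string grammar and from an s-expression.
(define-peg-string "as <-- 'a'+")
(define-peg-sexp bs body (+ "b"))

;; Match anchored at the start of the string, or search through it.
(match-peg as "aaa")
(search-peg bs "xxbb")
```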
Re: Names for PEG Functions
Hello, define-peg-sexp - define a nonterminal from an s-expression define-peg-string - define a set of nonterminals from a string To me this sounds like you are defining an sexp or a string, which doesn't make much sense. I don't think that we need to preserve symmetry here, because the first binds one identifier, whereas the second binds a number of identifiers. (Is that really necessary? It would be nicer if it just bound one identifier, or something. Dunno. Then how about define-peg-pattern for the s-expression one, and define-peg-string-patterns for the strings? That at least includes the difference in number of things bound, and also matches the names for the compile-* functions. As for binding more than one identifier - you have to bind patterns to variables if you want to reuse them in other patterns later on. If you know in advance what patterns you will be matching, you don't need to define any other names, but we don't really know that. One of the advantages of PEG is the idea that you can reuse portions of grammars in other grammars, so they compose nicely. Also, are the different `accum' things necessary? Just wondering. Unused bindings will probably be removed by the optimizer. Well, you can choose how much to accumulate at each s-expression, and this makes that choice for the top level. You have to make some choice at each level. The other option I can think of is to pick something as default, and then say that if you want to change it you can indicate that in the s-expression (via the special forms that do that). compile-peg-sexp - compile an sexp to a nonterminal (an opaque value to the user, but really just a function) compile-peg-pattern perhaps ? compile-peg-string - compile a string to a nonterminal compile-peg-string-pattern ? Sure. Just a note, though - this seems to make an s-expression pattern the default, and string a special case. (That's how I think of it too, but I realize that not everyone does :-) ). 
match-peg - match a peg to a string, starting at the beginning match-pattern ? search-peg - match a peg to a string, starting at each index in turn until we find a match or reach the end search-for-match ? How about 'search-for-pattern' instead, because everything else uses 'pattern'? I realize that putting 'peg' in the names isn't really necessary because the user could use a module renamer, as Ludovic pointed out a few days ago. I put 'peg' in the define-* syntax because I thought 'define-sexp' and 'define-string' were too general as names, and then I wanted the compile-* functions to be consistent with them. As for the others, 'match' and 'search' seemed too general. Yeah, dunno. What do you think about these names? Please don't take these suggestions too seriously. Your names seem good. I want the names to be decently self-consistent and descriptive, but I don't care much beyond that. Noah
A Plan for Hacking
Hello all,

It looks to me like the PEG project is wrapping up. We're talking about what to name things now, which means the library is likely pretty close to inclusion in Guile. After this I will want another Guile project to work on. My original intention was to work on a JIT compiler for Guile, but after following the list for a while, it seems like work on an ahead-of-time compiler might be more useful in the long term. This email is about my idea for what I will spend time on over the next several months. I'd like to talk about it beforehand to make sure that what I do will be useful, and so that I can coordinate with other people. This is especially important because I think we should use an unusual architecture for our compiler.

I think the next step is to write a static analyzer. This analyzer will be used by the compiler, but it will not just be for the compiler - it will also be available to users. I think of it as a tool for writing correct programs. Here are some use-cases for a static analyzer:

- You write a function that assumes its arguments are of a certain type. You'd like to be sure this is true, so your program won't throw exceptions in the middle.
- You write a program with more than one thread, which includes a series of mutexes. You'd like to verify that if a thread grabs more than one of these mutexes, then it always does so in a certain order.
- You write a program that will respond to HTTP requests. You'd like to verify that information from a certain set of variables only influences your response packets if the user has a certain cookie (i.e. verify that your web server is secure).
- You write some C code that manipulates Guile values. You'd like to be sure that the code does all of the appropriate Guile typechecks before using Guile library functions. (This is taken from the Guile to-do list, on the website.)
- You write a compiler for Guile code. You'd like to infer the types of variables at compile-time.
I also have three design considerations that I think are important for an analyzer like this:

- For any given proposition to be checked, it should say either "yes, this is true", "not true, and here's how it might fail", or "I can't tell if this is true or false". We cannot omit the last condition, because an analyzer without that option that can satisfy all of these use-cases is impossible (Gödel's theorem).
- It should work for multiple languages. This is necessary to do the C typechecks from the last use case, but I also hope it will be expanded to more languages as part of Guile integrating its compiler with GCC someday.
- It should integrate nicely with all languages it supports. If it isn't easy to use, people will be less likely to use it.

I said we should use an unusual compiler architecture because I think that the static analyzer, which is normally part of the compiler, should be thought of as its own component, which can be used independently of the compiler. I hope the use-cases above have convinced you that having an analyzer around could be a useful thing, but if not, I'm happy to discuss it more. (I will also make an argument that a static analyzer should be part of the interface to the compiler, if you are interested.)

So this is my proposal for the next several months: I work on a static analyzer for Guile, hoping to expand it to other languages as well. The main goal will be to make it very easy to use this analyzer in code, including providing helpful feedback to users when their assertions fail. While developing this, I will make sure that it can check things that would be useful to a compiler. Once we have a static analyzer, we can use it as the first step in an optimizing compiler.

What do you think?
Noah
Re: A Plan for Hacking
Hello, - You write a function that assumes its arguments are of a certain type. You'd like to be sure this is true, so your program won't throw exceptions in the middle. That would be cool. However, I suspect that for best results you’d want procedures to have type annotations, as in Racket’s Typed Scheme and Bigloo; otherwise it may be impossible to determine type info most of the time—e.g., a public top-level procedure used in another module. (It’s not clear to me whether this can be done without seriously compromising on dynamic module composition.) I think it can be done effectively, but I'm not quite sure what you mean by dynamic module composition. Do you mean that this might prevent you from making modules that use arbitrary other modules? Or that it might not work if you loaded modules at runtime from the REPL? I agree that we'll want type annotation. But you can also think of type annotation as a special case of the more general idea of caching intermediate steps to speed up analysis. After all, in theory you could always rederive the type information for each function from its source code every time you needed it. I wonder if this is the only case of caching that will matter, or if it is worthwhile to just make a more general caching mechanism. [...] These three items seem “off-topic” and less important to me. Okay. I'm not sure what I think about them right now, because I am very intrigued by the idea of using the analyzer to help correctness, but you might be right. I will think about it more. I thought of another use-case, and I wonder if you think this is on-topic or not: - You write a parallel version of map that is fast, but only correct if its argument has no side-effects. You'd like to use this parallel map instead of the regular one, but only when it is correct. To reduce frustration, I would narrow the use cases a bit, or define short-term milestones. 
For instance, you could decide to focus solely on type inference, or have type inference one of your major milestones. Yes, type inference is a major thing. As I was thinking about a compiler, the three things that seemed useful to me were type inference, determining the outlines of big data structures that were passed around (which could be type inference, depending on what your type system is), and figuring out the lifetimes of heap-allocated values. Kanren [0] and its companion book look like nice starting points to me. In particular, there’s a simple type inference engine written in Kanren that is actually usable with Guile 2.0 out of the box and could serve as a starting point. Great! I will look at those. I have also found a book called Software Foundations [0], which is an introduction to the Coq theorem prover. It might be useful to be familiar with a few systems before starting my own (or perhaps not even starting my own, but using Kanren). Do you also happen to know of a book on formal logic systems in general, so I can get more perspective? (I have one, but it doesn't cover as much as I wish it did.) I'm also interested in what Stefan Tampe is doing, because I believe he's implementing a formal logic system in Guile. Stefan, does your project connect to this one? And before that, make sure PEG actually gets merged. ;-) Yes, I will make sure that happens. Currently PEG is waiting on an email about names. Once we agree on those, then I know of nothing further blocking it from being merged. Thanks, Noah [0] http://www.cis.upenn.edu/~bcpierce/sf/
Re: GNU Guile branch, stable-2.0, updated. v2.0.2-101-gd851e32
Hello, (Though I think that the term ‘constant’, as used by GCC, is confusing. I don’t have a better name to propose, though.) How about 'algebraic'? As in functions in basic algebra. (I know it has a different meaning, though.) Or alternatively, something referencing the lambda calculus, since all lambda calculus functions have this property. Noah
Re: A Plan for Hacking
Hello, I think it can be done effectively, but I'm not quite sure what you mean by dynamic module composition. Do you mean that this might prevent you from making modules that use arbitrary other modules? Or that it might not work if you loaded modules at runtime from the REPL? I was referring to the ability to connect code snippets at run-time (by means of use-modules, C-x C-e in Geiser, etc.) and have an evolving, live system. Having types associated with values makes this easier. It may become harder once the compiler has built assumptions into the generated code about the types each procedure expects. I agree. I think that for statically generating code, where we want to do as much optimization as possible, we need something called a program definition, which is a list containing all entry points to the program. Then you could assume that all control flow starts at those entry points. I also think of this as something that the compiler will eventually use as well - for instance, if you want to generate a library, you would provide a list of the functions that would be entry points. For a standard unix program, there would be just one entry point. (The big advantage for the compiler is that the list could also contain ABI information, which is useful for people who make libraries that need to be ABI-compatible. I can argue for this more if you like.) It's not clear to me how well analysis will work in dynamically-edited code, but I would like to do as much as possible there. I think the key to that will be a combination of caching generated code, and invalidating the caches when necessary, and generating specialized versions of functions that are called by other compiled code but are separate from the generic entry points to those functions. About Kanren, there’s “The Reasoned Schemer”. I've ordered that book; it should be here soon. 
My plan is to read things for the next couple weeks, to make sure I really understand how an analyzer needs to work, come up with a plan for Guile's, and then email the list again to discuss it. Thanks, Noah
Re: Guile 2.0.x for cygwin, debian, ubuntu
I believe debian maintainers are responsible for packaging Guile for Debian. You should contact whoever maintains the Guile package there. I don't know anything about cygwin. On Thu, Sep 29, 2011 at 6:38 AM, roman ro...@web.de wrote: Hello, the last version of guile available for cygwin and debian/ubuntu is now 1.8.x. Is there a plan to build the guile 2.0.x packages for cygwin and debian/ubuntu ? Roman
Re: Names for PEG Functions
Hello, I hate to make more work for people, but I think the PEG module is almost ready for merging, and could probably be merged if we resolved this names issue. Any other thoughts? Noah On Thu, Sep 22, 2011 at 1:56 PM, Noah Lavine noah.b.lav...@gmail.com wrote: Hello, define-peg-sexp - define a nonterminal from an s-expression define-peg-string - define a set of nonterminals from a string To me this sounds like you are defining an sexp or a string, which doesn't make much sense. I don't think that we need to preserve symmetry here, because the first binds one identifier, whereas the second binds a number of identifiers. (Is that really necessary? It would be nicer if it just bound one identifier, or something. Dunno. Then how about define-peg-pattern for the s-expression one, and define-peg-string-patterns for the strings? That at least includes the difference in number of things bound, and also matches the names for the compile-* functions. As for binding more than one identifier - you have to bind patterns to variables if you want to reuse them in other patterns later on. If you know in advance what patterns you will be matching, you don't need to define any other names, but we don't really know that. One of the advantages of PEG is the idea that you can reuse portions of grammars in other grammars, so they compose nicely. Also, are the different `accum' things necessary? Just wondering. Unused bindings will probably be removed by the optimizer. Well, you can choose how much to accumulate at each s-expression, and this makes that choice for the top level. You have to make some choice at each level. The other option I can think of is to pick something as default, and then say that if you want to change it you can indicate that in the s-expression (via the special forms that do that). compile-peg-sexp - compile an sexp to a nonterminal (an opaque value to the user, but really just a function) compile-peg-pattern perhaps ? 
compile-peg-string - compile a string to a nonterminal compile-peg-string-pattern ? Sure. Just a note, though - this seems to make an s-expression pattern the default, and string a special case. (That's how I think of it too, but I realize that not everyone does :-) ). match-peg - match a peg to a string, starting at the beginning match-pattern ? search-peg - match a peg to a string, starting at each index in turn until we find a match or reach the end search-for-match ? How about 'search-for-pattern' instead, because everything else uses 'pattern'? I realize that putting 'peg' in the names isn't really necessary because the user could use a module renamer, as Ludovic pointed out a few days ago. I put 'peg' in the define-* syntax because I thought 'define-sexp' and 'define-string' were too general as names, and then I wanted the compile-* functions to be consistent with them. As for the others, 'match' and 'search' seemed too general. Yeah, dunno. What do you think about these names? Please don't take these suggestions too seriously. Your names seem good. I want the names to be decently self-consistent and descriptive, but I don't care much beyond that. Noah
ELisp?
Hello all, Could anyone tell me the status of the ELisp implementation since July 21st (the last update)? I'm especially interested now because of a thread on emacs-devel - http://lists.gnu.org/archive/html/emacs-devel/2011-09/msg00753.html. People were talking about adding multithreading to Emacs. Some people said that moving Emacs to Guile might be the best solution, but others were pessimistic about that ever happening. It looks like the Emacs people would really like to use Guile, and we would certainly like them to. Also, every time Emacs implements a feature that Guile already has, it's wasted effort when we eventually integrate them. It seems like Emacs-Guile integration is a project that is on the verge of happening, and just needs some sort of push to get it over the hump. I hope that this summer's SoC work can be the impetus. Noah
Re: ELisp?
That makes sense. I was hoping we could skip the phase where we have two lisp interpreters around, if Guile's ELisp implementation was good enough, but maybe that will not be possible. Emacs already does something similar to this for the programming language modes, where it has another interpreter as a subprocess, so this doesn't sound so bad technically.

One question I have is, how much of Emacs' C code can Guile use right now? I guess Ken Raeburn's patches are the ones that address that, but I don't know what is left to do. (I also have the same question about ELisp - what of the language is left to implement? Is it just adding more builtins?) I don't know if this would work, but the way I was imagining it happening is really just starting up Guile and loading Emacs source files, one by one. If we have a complete ELisp implementation with all of Emacs' C code, then in theory Emacs would start. I don't know what would actually happen though.

A branch on Savannah sounds like a really good idea though. Then at least there would be a central thing for people to look at and hack on, and more people might get involved. Right now I suspect that most people don't know what's happening or where to look. If you set up a branch, I for one would love to check it out and perhaps contribute some patches. Do you need permission from the Emacs maintainers to do that, or anything else? Noah

On Sun, Oct 9, 2011 at 9:37 AM, joa...@verona.se wrote: Noah Lavine noah.b.lav...@gmail.com writes: Hello all, Could anyone tell me the status of the ELisp implementation since July 21st (the last update)? I'm especially interested now because of a thread on emacs-devel - http://lists.gnu.org/archive/html/emacs-devel/2011-09/msg00753.html. People were talking about adding multithreading to Emacs. Some people said that moving Emacs to Guile might be the best solution, but others were pessimistic about that ever happening.
It looks like the Emacs people would really like to use Guile, and we would certainly like them to. Also, every time Emacs implements a feature that Guile already has, it's wasted effort when we eventually integrate them. It seems like Emacs-Guile integration is a project that is on the verge of happening, and just needs some sort of push to get it over the hump. I hope that this summer's SoC work can be the impetus. Noah

I'm an Emacs developer (not the most prolific, but I have contributed ImageMagick core integration and Webkit integration for Emacs 25). I'm now trying to learn non-trivial Guile usage so as to be useful in the future merge of Guile and Emacs. Currently, coming from the Emacs side, I have no idea how integration would happen. I've studied the various attempts but haven't come to any conclusion.

My naive approach would be to make an Emacs Guile branch on Savannah. The branch would unify incrementally the 3 different existing approaches:

- Nishida's patches to run Guile in Emacs
- Ken Raeburn's patches to replace the Emacs Lisp engine
- Templeton's implementation of Elisp in Guile

This branch would initially use Nishida's strategy of just linking Guile and providing some simple interoperability functions. The version linked would be Templeton's Elisp Guile version. So, there would be 2 lisp engines in Emacs for a while. When things work well the old Emacs lisp engine would be ripped out.

As I'm coming from the Emacs side of things my interest is in getting Guile in the Emacs core efficiently. My motivation is somewhat selfish because I don't want to re-implement all the interesting features Guile has in Emacs, such as dynamic FFI etc. Of course there are many details to work out. A random example could be that we would like a cookie to indicate that eval-buffer should use guile rather than elisp for a buffer. The advantage of the above approach is that it is incremental. We gain exposure in the Emacs developer community.
Since Emacs 24 is entering pretest, it's a good time to start to think of features for Emacs 25. -- Joakim Verona
Tree-IL Questions
Hello again,

I've been working on the compiler issue and I've had to start looking at processing Tree-IL. As a result, I realized that the Tree-IL documentation is a bit out-of-date. I would like to fix it, but there are some things I don't understand.

- It looks like every Tree-IL type has a src slot, correct? But if so, why doesn't the record-case at module/language/tree-il.scm line 271 (unparse-tree-il) match those src slots? Especially since the big match statement in peval (line 682) does match them (unless it matches no entries in the record)?
- On a related note, why do most of the Tree-IL record types not appear in the define-type statement in tree-il.scm line 133, and is that connected with the borrow-core-vtables macro that I don't understand at all? :-)
- Is it guaranteed that the exp slot in a const record will always be a Scheme value, or can it be a Tree-IL expression?

Thanks,
Noah
Re: Tree-IL Questions
Hello, Record-case parses slots by name, not by position. Match destructures by position, not by name. Oh, interesting. - On a related note, why do most of the Tree-IL record type not appear in the define-type statement in tree-il.scm line 133, and is that connected with the borrow-core-vtables macro that I don't understand at all? :-) Yes. Then I have a new goal: understand that. :-) Where are the docs not up-to-date? In stable-2.0 they should be current. That may be the issue, because I am using the latest git head. There were only very small issues, but I thought I should fix them anyway. Specifically, - some records weren't listed with a 'src' slot - the first two slots for letrec were out of order - it documented sequence instead of the newer seq, which has a slightly different structure The attached patch fixes all of these, I think. Have a good day, Noah 0001-Update-Tree-IL-Documentation.patch Description: Binary data
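Andy's point about name-based versus position-based access can be illustrated with a toy record (a sketch using a plain SRFI-9 record, not Tree-IL itself):

```scheme
(use-modules (srfi srfi-9) (ice-9 match))

(define-record-type <node>
  (make-node src exp)
  node?
  (src node-src)
  (exp node-exp))

;; (ice-9 match) destructures records positionally: to reach the
;; second slot, the first must appear in the pattern too.
(match (make-node #f 42)
  (($ <node> src exp) exp))    ; => 42

;; record-case, by contrast, selects slots by name, so a clause can
;; mention only 'exp' and silently ignore 'src'.
```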
Re: Tree-IL Questions
Then I have a new goal: understand that. :-) The deal is that macroexpansion produces tree-il. The usual macroexpander is provided by psyntax. But before psyntax is loaded, when the first bits of Scheme are seen, there is a boot expander written in C that produces tree-il, with support for a subset of Scheme. This is in expand.[ch]. Obviously it can't load scheme in order to get the vtable for seq, for example, so it defines some core subset of tree-il in C. Then when tree-il loads, it grabs that core subset and re-exports it. Does that help? Yes, and thank you very much for the explanation!
Re: Guildhall
Hello, I am certainly not an authoritative source, but it's been a while, so here is what I know. As far as I can tell, these questions haven't been decided yet. All we know is that we want a guildhall. However, having your packages in it would be great, so please do package them up and contribute them when it is ready. Thanks a lot for being willing to contribute. Noah On Tue, Nov 15, 2011 at 5:52 PM, Simon Haines simon.hai...@con-amalgamate.net wrote: I'd like to start looking into getting some of my libraries ready for the guildhall, and have a few quick questions. How are library namespaces going to be determined? I can see a couple of approaches: a) authoritative (contributors apply to a central authority for a namespace upon library contribution), or b) federated (contributors have a namespace, perhaps based on a login name, under which all their libraries live) I guess it's a matter of taste: (import (guildhall db nosql mongodb (1))) vs (import (guildhall simonhaines mongodb (1))). Perhaps this is mere bikeshedding and the best approach is 'anything goes'. What's the contribution process going to look like? I assume a 'contribute' form on a web page somewhere, similar to PLaneT or CPAN. Some infrastructure may be required to support this: repository hosting and an announce mailing list. On the latter points, the last I heard guildhall.gnu.org was being sought for hosting, and I guess guile-user would be a good announce list. I think the guildhall would be a fantastic development for guile, and I'd like to do all I can to get the ball rolling, from paper-chasing to helping out with the dorodango integration. Just let me know how I can help. Simon.
Re: *current-language*
I guess in general I'd prefer something like Racket's #!lang directives, though I'm not opposed to this approach. Dunno! How about using language directives when available, and trying to guess the language when not? And about the directives, what should they be? ',language' is what we used to use at the REPL, and I was about to write an email arguing for that, until I realized that there is one big advantage to using the same thing as Racket - we might actually share some languages. Specifically, if someone had a Scheme file that started with '#!lang r5rs', it's possible that Guile and Racket could both run that file without modification. That seems pretty cool. Mixing other languages could be scary though. Noah
Update and Questions
Hello Guile developers, Despite the long gap since my last email, I have in fact been working on the compiler project. This email has two purposes: first, to make sure that I have a basically reasonable design before I work more, and second, to ask a logistical question. First of all, I very much appreciated your suggestions of things to read. I did look at the Reasoned Schemer. I also spent quite a while studying Stalin (the compiler), because I believe that is the current fastest Scheme compiler. (And therefore, it's the one I'd like to beat with our compiler.) The design I am currently working with is the same as Stalin's design. It is a propagation network. Basically you have a graph of Tree-IL nodes, where each Tree-IL node is connected to its parent, its children, and optionally other nodes (a variable is connected to the place it was declared; a primitive function is connected to the error continuation it could jump to). The compiler propagates information along the edges of this graph, attempting to find the minimum set of possible values for each variable and each jump target. It also performs partial evaluation as a special case of this. What I have described also seems to be how Hindley-Milner type inference works (although I think we can do a bit better than standard Hindley-Milner systems). Does this seem like a good design for the static analyzer portion? One big oversight that I know of is that this doesn't do any sort of specialization. I think that is an extremely important feature, but when I looked at doing specialization I thought that it would use all of the logic this would, but require a bit more work to keep the complexity down, so I thought I should do this first. I haven't thought much about the rest of the compiler yet, but I imagine you can basically walk this graph and try to choose representations for each piece of it using the information you gathered in the analysis phase. 
The key ingredient for that will be a library of representations for each Tree-IL node. The second part of this post is a logistical question. I would like to do this work in public, so other people can see it (and hopefully contribute). Where is the best place to do this? Should I make a branch on Savannah? That makes the most sense to me, but it seems a bit unnatural, since this project is very separate from the rest of Guile. And finally, a fun thing. The current best Scheme compiler (that I know of) is called Stalin. Therefore, I suggest that our compiler be called Stallman, after our fearless leader (whose name also starts with s). :-) Thanks a lot, Noah
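As a very rough illustration of the propagation idea described above (a sketch added here for concreteness, not code from the actual branch — all names are invented): each node carries a set of possible values, and narrowing a node's set re-queues its neighbors for re-examination.

```scheme
;; Sketch only: a node holds a set of possible values ('anything means
;; no information yet).  Narrowing a node's set pushes its neighbors
;; back onto the worklist, so information flows along the graph edges
;; until a fixed point is reached.
(use-modules (srfi srfi-9))

(define-record-type <node>
  (make-node name values neighbors)
  node?
  (name node-name)
  (values node-values set-node-values!)
  (neighbors node-neighbors set-node-neighbors!))

(define (narrow! node new-values worklist)
  ;; Intersect NODE's value set with NEW-VALUES; if the set shrank,
  ;; the neighbors need to be examined again.
  (let* ((old (node-values node))
         (next (if (eq? old 'anything)
                   new-values
                   (filter (lambda (v) (member v new-values)) old))))
    (if (or (eq? old 'anything) (< (length next) (length old)))
        (begin (set-node-values! node next)
               (append (node-neighbors node) worklist))
        worklist)))
```

A driver would pop nodes off the worklist and call narrow! until the list is empty; partial evaluation falls out when a set narrows to a single value.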
Re: Compiler Branch
Cool. As a quick reaction, I have some doubts about this project. But, I guess a WIP branch would be a good thing to have, and it would make the discussion more concrete. Probably so. But if you have time, what are your doubts? I would much rather talk about problems now than after I've implemented something that we'll have to change. Noah
Re: Anything better for delayed lexical evaluation than (lambda () ...)?
Hello, I haven't really been contributing to this thread, so please take my opinion with a grain of salt. But it does appear to me that we should support capturing a lexical environment, as Mark and David describe. So I took a look at ice-9/eval.scm to see how difficult it would be to implement. Offhand, it doesn't look bad: the eval function there already passes around environment objects, so if it hit this special form, it would simply return its environment object (probably packaged up in a record so it would print nicely). Restarting it is also simple: call eval on an expression with the given environment. The environment objects already contain all of the information needed to evaluate expressions, so I don't think there is very much to do there. The part that seems more interesting to me is that Guile's evaluator attempts to memoize an entire expression before evaluating any of it, which I understand is impossible with Lilypond. I assume Lilypond handles this by bundling the Lilypond code into a string (or some other object), letting the memoizer look at that, and then later doing the actual expansion. David, is this how you handle that? Noah On Tue, Dec 13, 2011 at 2:15 PM, Mark H Weaver m...@netris.org wrote: Andy Wingo wi...@pobox.com writes: On Tue 13 Dec 2011 18:28, Mark H Weaver m...@netris.org writes: (let ((xxx 2)) #{ #(set! xxx (1+ xxx)) #}) In the general case, Lilypond needs to _execute_ the outer Scheme code before the parser/evaluator is able to even _see_ the inner Scheme code, because it needs to parse/evaluate the Lily code in between the two, and we've already established that parsing cannot be done without runtime information. What does it mean to execute a `(let ((xxx 2))' ? I think I need to read the thread again, because I am really not getting it. Well, this example is a bit too simple to illustrate the point. Let's consider a slightly more complex example: (let ((xxx (foobar 1 2))) #{ #(begin (set! 
xxx (1+ xxx)) (let ((yyy (foobar 3 4))) #{ #(set! yyy (+ xxx yyy)) #} )) #} ) In this case, Lilypond would need to start by evaluating: (let ((xxx (foobar 1 2))) (capture-lexical-environment)) which entails evaluating (foobar 1 2), extending the lexical environment with a binding for xxx, and then returning a new lexical environment object. Lilypond would then continue to parse/evaluate the Lilypond code beginning with #{, which in the general case must be done at the same time as execution. When it finds the #( it enters Scheme mode again, so it would then pass the lexical environment object from the previous step to local-eval with the following expression: (begin (set! xxx (1+ xxx)) (let ((yyy (foobar 3 4))) (capture-lexical-environment))) which entails mutating xxx, evaluating (foobar 3 4) and extending the lexical environment again (which should now contain both xxx and yyy), and then returning a new lexical environment object. And so on. Does this make sense? How difficult would it be to implement this? Dunno. I was thinking that we could have a special form to return a list of the identifiers in scope. I don't think this is sufficient. The special form must return a lexical environment object that contains everything needed by a local-eval procedure (which we should also provide) to evaluate arbitrary scheme code within that lexical environment. The key is that we must create the lexical environment object before we know anything about the code that will later be passed to local-eval. Thanks, Mark
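If Guile grew the interface Mark describes — a capturing special form plus a local-eval procedure — the simple example at the top of the thread might run like this (a hypothetical sketch: the module name and the exact names `the-environment'/`local-eval' are assumptions, since nothing had shipped at the time of this thread):

```scheme
;; Hypothetical usage of the proposed interface; neither form existed
;; in Guile when this thread was written.  The module name is an
;; assumption about where such an interface would live.
(use-modules (ice-9 local-eval))

(define env
  (let ((xxx 2))
    (the-environment)))                 ; capture the scope where xxx = 2

(local-eval '(set! xxx (1+ xxx)) env)   ; mutate xxx inside that scope
(local-eval 'xxx env)                   ; evaluates to 3
```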
Re: Anything better for delayed lexical evaluation than (lambda () ...)?
Perhaps this is obvious to everyone else, but it just occurred to me that (capture-lexical-environment) is just (call-with-current-continuation), but running in the environment of the evaluator instead of the program being evaluated. It's as though the evaluator was going to look in a tree for more code, but hit a special node and did a (call/cc). I hope other people find this interesting. Good morning, Noah
Re: summary: lilypond, lambda, and local-eval
Hello, Indeed, only the macro expander has enough information to generate an optimal list of reachable lexicals, i.e. lexical variables that are accessible using normal symbols (as opposed to syntax objects) [more on this below]. Are you certain that you want to restrict the set of identifiers? To me it sounds like an optimization, perhaps premature. If I understand correctly, Mark wants to restrict the set of variables you can access to those you could access through normal Scheme code. This is an issue because psyntax happens to provide a way to access more variables than standard Scheme. If this is the case, I think we should absolutely restrict it to standard Scheme, because letting you access everything psyntax can access a) is not Scheme and b) restricts our future implementation choices. What do you think about having a the-environment tree-il form have a field for the names and a field for the gensyms of captured lexicals? This is great news, because it means that `the-environment' will no longer require its own tree-il type, and the compiler will only see the standard scheme code that it expands into. This actually seems bad to me, although I'm just guessing. Because the thing is, it's not *really* the Scheme code that you wrote, and so the compiler is seeing something wrong. It has the same variable-capture properties as that code, but it's not actually that. My instinct is that compiling non-accurate code is going to be more trouble than it's worth, but that's just a guess. In general, this thread has been very, very impressive. Thanks a lot to everyone who has been working so hard on this. Noah
Re: summary: lilypond, lambda, and local-eval
Hello, If I understand correctly, Mark wants to restrict the set of variables you can access to those you could access through normal Scheme code. This is an issue because psyntax happens to provide a way to access more variables than standard Scheme. If this is the case, I think we should absolutely restrict it to standard Scheme, because letting you access everything psyntax can access a) is not Scheme and b) restricts our future implementation choices. If psyntax accesses more than Scheme can access while doing a task that is worth doing, what chance would there be in giving Scheme the power to access what psyntax can? Let me explain what I mean more. Here is a code example: (let ((x 3)) (let ((x (* x 5))) (the-environment))) At the point of the-environment, there is one Scheme variable around, 'x', and its value is 15. The old value of x is shadowed by the new one. But psyntax works by assigning each of these x values a gensym, and they get different gensyms. So in theory, you could access the outer value of x (x = 3) if you had a handle on the right gensym. Supporting this seems weird to me, because I view psyntax as an implementation detail of Guile, and not something that should affect programmers. I also think that not supporting it is fine because if you want to be able to access both values, you can always give your variables different names. So this isn't giving you more power, but just a weird way to access it. If exporting enough of the compiler environment to be useful for implementing multiple compilation units sharing lexical environments is feasible, what are the implications for splitting a compilation into separate units when the extent of psyntax is not similarly involved? In short: where should one draw the line in a way that makes best use of already done compilations without having to redo too much to be of practical use? I do not know enough to respond to the rest of your email. Noah
Re: [Guile-commits] GNU Guile branch, wip-compiler, updated. v2.1.0-139-gf8a333e
Hello, Sorry it's been so long since my last reply. I've been somewhat busy around the holidays, but I hope to work more soon. Also, what about “vset” or just “set” instead of “value-set”? Yes, probably a good idea. + (set! (a-verify-exps ret) + (map (lambda (x) (rec ret x env)) args)) + ret)) Please privilege a functional style, as much as possible (in some cases we lack the tools to do better, so that’s fine.) I don't think there's a functional way to do this, although I could be wrong. What I want to do is make a structure where the child tree nodes are linked to their parents. I can't get the address of the parent until after I allocate it, so I allocate it with an empty list of children, then use its address to make the list of children that I want. Another approach would be to make the list of children before allocating the parent, but then I would have to go back and fix up their 'parent' pointers. So I think either way I'm stuck doing some mutation. +(pass-if value-set-can-be-anything? + (value-set-can-be-anything? anything)) In Emacs you can (put 'pass-if 'scheme-indent-function 1) to get the “right” indentation. Ah, thanks. Done. Noah
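The allocate-first-then-patch pattern described above might look roughly like this (illustrative names, not the actual wip-compiler code):

```scheme
;; Sketch of the "allocate the parent empty, then mutate in its
;; children" pattern: the children need the parent's identity, so one
;; side of the link has to be fixed up after allocation.
(use-modules (srfi srfi-9))

(define-record-type <tnode>
  (make-tnode parent children)
  tnode?
  (parent tnode-parent)
  (children tnode-children set-tnode-children!))

(define (make-parent-of child-count)
  ;; Allocate the parent with no children, then install children that
  ;; point back at it.
  (let ((parent (make-tnode #f '())))
    (set-tnode-children! parent
                         (map (lambda (i) (make-tnode parent '()))
                              (iota child-count)))
    parent))
```

The alternative mentioned in the message — building the children first — would instead require a settable parent slot and a fix-up loop over the children, so some mutation seems unavoidable either way.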
Re: Guile: What's wrong with this?
Hello, Then it turned out that the string functions would now clear the high order bit on strings, so they are no longer byte arrays and there is no replacement but to roll my own. I stopped supporting byte arrays. A noticeable nuisance. This is just a side note to the main discussion, but there is now a 'bytevector' datatype you can use. Does that work for you? If not, what functionality is missing? Thanks, Noah
Re: Continuation sets and order-independency
Let me see if I understand what you mean. I think you're talking about an expression like this: (cons (call/cc store-this-continuation) (call/cc store-this-continuation)) and you want a way to distinguish the first and the second call/cc, by guaranteeing the order they are hit. This will let you choose which value to put in the car and which to put in the cdr. Is that right? Noah On Wed, Jan 4, 2012 at 8:16 AM, David Kastrup d...@gnu.org wrote: Hi, I was just wondering about the ability for using multiple continuations in contexts that don't guarantee an order of execution. Functions like map, list and other structure builders. If one uses those for building a structure, and some paths of execution hit a call-with-current-continuation, is it feasible to have Guile refrain from actually taking the continuation before all other paths of execution have finished or reached a call-with-current-continuation as well? Meaning that if one takes the whole set of continuations as a template for creating a data structure from values that are not known at the time of building the template, that one can fill in the whole set with actual values/actions/calculations, and have only the actions depending on those variables be redone. Sort of like constant expression folding (some sort of quasi-quoting superset) on steroids. -- David Kastrup
Re: Guile: What's wrong with this?
Hello all, I must admit that I do not know much about why R5RS says that literals are constant, but I think there is a misunderstanding. Bruce does not want `define' to always copy its result. I think what he wants is for literals embedded in source code to be mutable. This would, of course, imply that each literal in the source code would be a new copy, even if they were identical. Weirdly enough, that is how my intuition works too. After all, if I made a string object in Scheme without going to any trouble, I would get a mutable object. If I write down a string, I expect to get the same sort of object. Bruce is also right that this enables quick and easy programming that munges strings. And I think the argument about putting strings in constant memory is bad - constant memory is an implementation detail. If it happens that we can store literals more efficiently when they are not mutated, then perhaps we should just detect that case and switch representations. Of course there is a trade-off here between ease of implementation and ease of use. This change seems pretty unimportant to me, especially if Python does all right with immutable strings, so I do not think it's important for us to support it. I just don't buy the arguments against supporting it. Noah On Thu, Jan 5, 2012 at 8:41 PM, Mark H Weaver m...@netris.org wrote: Mike Gran spk...@yahoo.com writes: It is curious that action of 'copy' really means the action of 'create a copy with different properties'. Shouldn't (string-copy a) create another immutable string? Why would you want to copy an immutable string? Likewise, shouldn't (substring abc 1) return an immutable substring? As I understand it, in the Scheme standards (at least before R6RS's immutable pairs) the rationale behind marking literal constants as immutable is solely to avoid needlessly making copies of those literals, while flagging accidental attempts to modify them, since that is almost certainly a mistake. 
If that is the only rationale for marking things read-only, then there's no reason to mark copies read-only. The philosophy of Scheme (at least before R6RS) was clearly to make almost all data structures mutable. Following that philosophy, in Guile, even though (substring abc 1) postpones copying the string buffer, it must create a new heap object. Once you've done that, it is feasible to implement copy-on-write. Now, the immutable pairs of R6RS and Racket have an entirely different rationale, namely that they enable vastly more effective optimization in a compiler. In this case, presumably you'd want copies to retain the immutability. Mark
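The copy-to-mutate workaround under discussion is a one-liner in practice; a small sketch of the behavior in Guile 2.0:

```scheme
;; In Guile 2.0, string literals are read-only, but string-copy
;; returns a fresh mutable string with the same contents, so working
;; around literal immutability is a single call.
(define greeting (string-copy "hello"))
(string-set! greeting 0 #\H)
greeting                        ; => "Hello"

;; By contrast, mutating the literal itself, as in
;; (string-set! "hello" 0 #\H), signals an error.
```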
Re: Compiler Branch
Hello, I'm going to try to combine the two parallel compiler threads into one, by replying to two messages at once. I hope this makes things simpler rather than more confusing. :-) Message 1: Here is my calculus regarding this work, FWIW: it is low risk, and low cost, since it doesn't actually affect Tree-IL or any other part of the compiler. The only cost it imposes is a coupling with Tree-IL, which is easy enough to deal with on that side. So that makes me much less uneasy :-) And on the positive side, it has the potential to prove interesting things, so at this point I'm neutral about it. If it does something useful, IMO we should throw it in, when you're ready for that. Other people's opinions are welcome as well, of course. Well, I would hope to have it merged eventually. It will be a while before it's ready to do that, though. Can you describe what the branch does, currently? So far it doesn't do any interesting analysis - it just has bookkeeping code and a very rough interface to an analyzer. The function called 'go' runs everything. You give it a Scheme expression. It compiles that expression to tree-il, then converts it to annotated-tree-il. Then it scans the annotated-tree-il looking for instances of the special function 'verify'. The idea of 'verify' is that if the analyzer can't prove that things in a verify always evaluate to true, it will throw a warning. It's a simple way to mark up your code with your expectations. For instance, (verify #t) always passes verification, and (verify #f) never passes. (verify (some-big-expression)) might or might not pass, depending on the expression and depending on how well we could analyze the program. I imagine that it would primarily be used for things like this: (define (map proc list) (verify (list? list)) ...) If your program didn't pass the verify check, you could still run it, but you probably shouldn't. What do you see it doing within three months or so? Verifying statements, but with some actual inference. 
How long do you intend to work on it? At least a year, and probably more if that's what it takes to make it do what I want. I would really like to make it usable and useful. What do you think about the tree-il differences in master relative to stable-2.0? I don't know what the differences are, since I just base all of the work on master. Do you see this work as an optional pass, or a core part of the compiler? If the latter, what sort of algorithmic complexity are you envisioning for this work? (O(n) in size of program is ideal of course.) My first idea was to implement something equivalent to 0-CFA, which unfortunately has complexity O(n^3). If there's something that's faster and still produces useful results, that could be a good first step. However, I also think we could get the average-case time far below n^3 by doing inference on demand instead of calculating the type of every binding, similar to the change that peval went through a couple months ago. I think the complexity really determines how it could be used in the compiler. Ideally it would be very fast, and it could work as an extension to peval. If it's slower, it could only be used if the user requested that the compiler do more work. Either way, I'd like to see it help generate native code, and ideally native binaries. Message 2: This sounds cool. I assume you're familiar with kCFA? See http://matt.might.net/articles/implementation-of-kcfa-and-0cfa/, for example. No, I hadn't read about it before. Thank you very much for the pointer! I admit that I am new to writing real compilers, so pointers to papers are great. I have now read the first part of Matt Might's thesis, where he describes k-CFA. 0-CFA is equivalent to what I was planning to do, I think, but his implementation is more elegant than what I had thought of. k = 1 seems to have performance too slow to be useful to us. I also think k = 1 does some wasted work in many common cases. 
I have ideas on how to fix this, but I suspect they're equivalent to the abstract garbage collection he talks about. In any case, I think 0-CFA is a good first step, and later I hope to get some variant of 1-CFA that does less work than full 1-CFA. It doesn't seem to me that static analysis is a prerequisite for AOT compilation -- and indeed, the current .go compilation is an example of naive AOT compilation. Yes, excellent point. I was blurring two ideas together. I would eventually like this work to lead to an optimizing native-code compiler, so I am planning ahead for that. As a side note, you'll see that the current code is quite ugly in many ways. I used to think that the best choice was to do the analysis in a form very similar to Tree-IL, to keep from disturbing the program structure too much. Now I suspect that it would be best to use something like continuation-passing style, to keep the analyzer simpler. So I hope that you'll see more elegant-looking code soon.
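For concreteness, the scanning step described above (before any inference happens) amounts to walking a tree and collecting verify forms. Here is a toy version over plain s-expressions, added for illustration — the real branch walks annotated-tree-il, not raw source:

```scheme
;; Toy sketch: collect every (verify ...) form in an s-expression
;; tree.  An analyzer would then try to prove each collected
;; expression true, warning when it cannot.
(define (find-verifies expr)
  (cond ((and (pair? expr) (eq? (car expr) 'verify))
         (list expr))
        ((pair? expr)
         (append (find-verifies (car expr))
                 (find-verifies (cdr expr))))
        (else '())))

(find-verifies '(define (my-map proc lst)
                  (verify (list? lst))
                  body))
;; => ((verify (list? lst)))
```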
Re: Continuation sets and order-independency
Okay, let me see if this is right now. In the expression (list (call-with-current-continuation func) (+ 4 14)), you want the addition to be done before the call-with-current-continuation, as opposed to being part of the continuation. Right? Noah On Wed, Jan 4, 2012 at 9:00 AM, David Kastrup d...@gnu.org wrote: Noah Lavine noah.b.lav...@gmail.com writes: On Wed, Jan 4, 2012 at 8:16 AM, David Kastrup d...@gnu.org wrote: Hi, I was just wondering about the ability for using multiple continuations in contexts that don't guarantee an order of execution. Functions like map, list and other structure builders. If one uses those for building a structure, and some paths of execution hit a call-with-current-continuation, is it feasible to have Guile refrain from actually taking the continuation before all other paths of execution have finished or reached a call-with-current-continuation as well? Meaning that if one takes the whole set of continuations as a template for creating a data structure from values that are not known at the time of building the template, that one can fill in the whole set with actual values/actions/calculations, and have only the actions depending on those variables be redone. Sort of like constant expression folding (some sort of quasi-quoting superset) on steroids. Let me see if I understand what you mean. I think you're talking about an expression like this: (cons (call/cc store-this-continuation) (call/cc store-this-continuation)) and you want a way to distinguish the first and the second call/cc, by guaranteeing the order they are hit. This will let you choose which value to put in the car and which to put in the cdr. Is that right? No. I exactly _don't_ want to have an order dependency. The end application is more or less supposed to be similar to lambda with everything not depending on a function parameter being preexecuted. When doing sequential operations, the preexecution will stop at the point in sequence where call/cc is hit the first time. 
But when doing parallelizable operations without a guaranteed order of operations, one could let the preexecution continue on independent operations until _every_ parallelizable operation has reached the point of call/cc. -- David Kastrup
Re: Compiler Branch
Hello, Interesting. `verify' seems to be a form of contracts: http://ftp.ccs.northeastern.edu/scheme/pubs/icfp2002-ff.pdf Does `verify' have runtime semantics? Under what situations, if any, would the compiler insert runtime checks? It has no runtime semantics right now. I considered making it like 'assert', but I'm not sure that's right. I will look at that paper. As that paper indicates, two issues you will have to deal with are higher-order functions and blame. Your interest in static analysis naturally raises the question of types. You might like this paper: http://www.ccs.neu.edu/racket/pubs/dls06-tf.pdf I will look at that too; thank you. Ah, I was just curious. I made some small changes relative to stable-2.0 (primcall and seq), and wondered if they were a good idea or not. I was also considering a move to a CPS-based intermediate language. Some links are here: http://wingolog.org/archives/2011/07/12/static-single-assignment-for-functional-programmers Oh, this is interesting. I was just wondering if I needed a CPS-type representation to write the analyzer reasonably elegantly. If you think the main compiler also needs it, then perhaps I should work on that first, and then come back to the analyzer question. I do think there's a problem with plain CPS, though - it forces you to pick an order for the evaluation of function arguments. I would like to use CPS with some sort of parallel-call operator, so we can leave the order undefined (maybe at some point an optimizer will want to adjust the order). What do you think? I also noticed that at the end of that blog post you said you were considering ANF versus CPS for Guile (I assume you'd already decided that you didn't like Tree-IL). Does this mean you decided on CPS? My first idea was to implement something equivalent to 0-CFA, which unfortunately has complexity O(n^3). If there's something that's faster and still produces useful results, that could be a good first step. 
However, I also think we could get the average-case time far below n^3 by doing inference on demand instead of calculating the type of every binding, similar to the change that peval went through a couple months ago. Yes, this is my thought as well. Note also that peval is described by waddell and dybvig as being a kind of special-purpose sub-0CFA. That makes sense. What I'd *really* like to do is make the analyzer use the same on-demand-calculation infrastructure as peval, but it might be really tricky to make them fit together. I am planning to leave that project for much later. Noah
Re: Names for PEG Functions
Sorry for the delay. I haven't thought about the PEG stuff in a long time, but looking back, I'm pretty sure I didn't change the names yet. I will try to do it tonight (in GMT-5). I agree, it would be great to have the PEG stuff finished. On Thu, Jan 19, 2012 at 4:53 AM, Andy Wingo wi...@pobox.com wrote: On Wed 04 Jan 2012 19:12, Andy Wingo wi...@pobox.com writes: On Mon 03 Oct 2011 20:21, Noah Lavine noah.b.lav...@gmail.com writes: I hate to make more work for people, but I think the PEG module is almost ready for merging, and could probably be merged if we resolved this names issue. Any other thoughts? Have you updated the wip-peg branch? I don't remember who is waiting for whom any more, but reviewing the threads, it seems that we generally agreed on some name changes here. A kind ping here :) It would be lovely to get the peg stuff in for 2.0.5. Andy -- http://wingolog.org/
Re: impressions on gc
Hello, As long as we're pinging people for 2.0.5, I don't think this patch ever got pushed. :-) I can't build master right now. This is partly my fault for doing so little sysadmin work that I still have libgc 7.1, but I still think this one should really, really be in 2.0.5 if the GC changes will also be there. Noah On Thu, Dec 8, 2011 at 3:10 PM, Andy Wingo wi...@pobox.com wrote: On Thu 08 Dec 2011 16:46, Chris K. Jester-Young cky...@gmail.com writes: Alas, part of that diff also breaks libgc 7.1 (which is the version that Debian currently has packaged), which lack the unmapping functions. Here's my diff to make it work again. Thanks for the patch! In the future a git-format-patch attachment is easiest. Though, if you like, we can set you up with commit access, I think. Request to be added to the `guile' group on savannah and we'll approve. I'll push this one though. Regards, Andy -- http://wingolog.org/
Re: Names for PEG Functions
I've run into trouble because of my problems building master. I'll have to work around that, so it won't happen tonight. On Thu, Jan 19, 2012 at 8:54 AM, Noah Lavine noah.b.lav...@gmail.com wrote: Sorry for the delay. I haven't thought about the PEG stuff in a long time, but looking back, I'm pretty sure I didn't change the names yet. I will try to do it tonight (in GMT-5). I agree, it would be great to have the PEG stuff finished. On Thu, Jan 19, 2012 at 4:53 AM, Andy Wingo wi...@pobox.com wrote: On Wed 04 Jan 2012 19:12, Andy Wingo wi...@pobox.com writes: On Mon 03 Oct 2011 20:21, Noah Lavine noah.b.lav...@gmail.com writes: I hate to make more work for people, but I think the PEG module is almost ready for merging, and could probably be merged if we resolved this names issue. Any other thoughts? Have you updated the wip-peg branch? I don't remember who is waiting for whom any more, but reviewing the threads, it seems that we generally agreed on some name changes here. A kind ping here :) It would be lovely to get the peg stuff in for 2.0.5. Andy -- http://wingolog.org/
Re: Names for PEG Functions
I did the renames. I had to push them as a new branch, 'wip-peg-fixed-2', because I also rebased on a more recent version of master to avoid some build problems. However, the peg patches are independent of everything that's been happening in master and stable-2.0, so they should apply to both branches cleanly (except possibly for merging the makefiles). The new names are: define-peg-pattern: define a nonterminal from an s-expression define-peg-string-patterns: define many nonterminals from a string match-pattern: see if a nonterminal matches a string search-for-pattern: see if a nonterminal matches a string, starting at each successive offset until a match is found or you reach the end compile-peg-pattern: turn an s-expression or a string into syntax which defines a parser for the given pattern The only weirdness I can see is that we don't really need two define-peg-* things, because the macro could just detect whether it got a string or an s-expression and do the right thing. I can fix that if other people think it should be done. Noah On Thu, Jan 19, 2012 at 10:18 PM, Noah Lavine noah.b.lav...@gmail.com wrote: I've run into trouble because of my problems building master. I'll have to work around that, so it won't happen tonight. On Thu, Jan 19, 2012 at 8:54 AM, Noah Lavine noah.b.lav...@gmail.com wrote: Sorry for the delay. I haven't thought about the PEG stuff in a long time, but looking back, I'm pretty sure I didn't change the names yet. I will try to do it tonight (in GMT-5). I agree, it would be great to have the PEG stuff finished. On Thu, Jan 19, 2012 at 4:53 AM, Andy Wingo wi...@pobox.com wrote: On Wed 04 Jan 2012 19:12, Andy Wingo wi...@pobox.com writes: On Mon 03 Oct 2011 20:21, Noah Lavine noah.b.lav...@gmail.com writes: I hate to make more work for people, but I think the PEG module is almost ready for merging, and could probably be merged if we resolved this names issue. Any other thoughts? Have you updated the wip-peg branch? 
I don't remember who is waiting for whom any more, but reviewing the threads, it seems that we generally agreed on some name changes here. A kind ping here :) It would be lovely to get the peg stuff in for 2.0.5. Andy -- http://wingolog.org/
Re: syntax-local-binding
Hello, If we can already foresee the need to deprecate an interface, wouldn't it be better not to add it in the first place? I don't see the need to deprecate them now, not more than any other identifier that we export. I think this may be the key to this argument. There are two separate questions being debated as one question. Here they are: 1. Do we foresee a need to deprecate the syntax-local-binding functionality in the future, for instance to move away from psyntax? 2. Is this version of syntax-local-binding the best interface to this functionality? They are related questions, but they are distinct, because someone could believe that it is good to expose this functionality but that the current syntax-local-binding is not the best interface for us. I could be wrong, but I think that Andy answers "maybe, but not for a long time" to 1, and therefore thinks it's fine to include syntax-local-binding. Mark answers "maybe, but it definitely needs more thought before we make a commitment" to 2, and therefore does not want to include syntax-local-binding. These are not contradictory positions. (Some of their other positions are contradictory, though :-) ). However, it does make me think that we should discuss the interface to syntax-local-binding more before releasing it (but don't take this too seriously, because I didn't follow the earlier threads much). Noah
Build Error in master
Hello, I just checked out the latest master. I attempted to build it, but got stuck at this point: GENguile-procedures.texi /bin/sh: line 1: 75746 Broken pipe cat alist.doc arbiters.doc array-handle.doc array-map.doc arrays.doc async.doc backtrace.doc boolean.doc bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc debug.doc deprecated.doc deprecation.doc dynl.doc dynwind.doc eq.doc error.doc eval.doc evalext.doc expand.doc extensions.doc feature.doc filesys.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc procs.doc promises.doc r6rs-ports.doc random.doc rdelim.doc read.doc root.doc rw.doc scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-1.doc srfi-4.doc srfi-13.doc srfi-14.doc srfi-60.doc stackchk.doc stacks.doc stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc threads.doc throw.doc trees.doc uniform.doc values.doc variable.doc vectors.doc version.doc vports.doc weak-set.doc weak-table.doc weak-vector.doc dynl.doc posix.doc net_db.doc socket.doc regex-posix.doc 75747 Segmentation fault | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guild snarf-check-and-output-texi guile-procedures.texi So I verified that meta/guile also produced a segmentation fault, rebuilt with debugging enabled, and then did meta/gdb-uninstalled-guile. Here is the backtrace I got at the segfault: Program received signal EXC_BAD_ACCESS, Could not access memory. 
Reason: KERN_INVALID_ADDRESS at address: 0x030c
0x7fff8707e257 in pthread_mutex_lock ()
(gdb) bt
#0  0x7fff8707e257 in pthread_mutex_lock ()
#1  0x0001000f2839 in scm_c_weak_set_add_x (set=0x10101cff0, raw_hash=2821877404945499297, pred=0x1000be04b <symbol_lookup_predicate_fn>, closure=0x10101efc0, obj=0x10101efc0) at weak-set.c:758
#2  0x0001000be1f7 in scm_i_str2symbol (str=0x10101efe0) at symbols.c:254
#3  0x0001000be8dc in scm_from_latin1_symboln (sym=0x100122478 "vm-run", len=6) at symbols.c:516
#4  0x0001000be836 in scm_from_latin1_symbol (sym=0x100122478 "vm-run") at symbols.c:499
#5  0x0001000f071b in scm_bootstrap_vm () at vm.c:888
#6  0x00010004872b in scm_i_init_guile (base=0x7fff5fbfef90) at init.c:396
#7  0x0001000bfe79 in scm_i_init_thread_for_guile (base=0x7fff5fbfef90, parent=0x0) at threads.c:824
#8  0x0001000bff80 in with_guile_and_parent (base=0x7fff5fbfef90, data=<value temporarily unavailable, due to optimizations>) at threads.c:890
#9  0x000100330e09 in GC_call_with_stack_base ()
#10 0x0001000c00d2 in scm_i_with_guile_and_parent (func=0x1000485dd <invoke_main_func>, data=0x7fff5fbff040, parent=0x0) at threads.c:940
#11 0x0001000c00fe in scm_with_guile (func=0x1000485dd <invoke_main_func>, data=0x7fff5fbff040) at threads.c:946
#12 0x0001000485be in scm_boot_guile (argc=1, argv=0x7fff5fbff0b8, main_func=0x10e46 <inner_main>, closure=0x0) at init.c:318
#13 0x00010edb in main (argc=1, argv=0x7fff5fbff0b8) at guile.c:81

Does anyone know what might have caused this? Thanks, Noah
Re: Build Error in master
I don't think so, because I don't get "Throw to key misc-error with args (module-transformer no module, and `macroexpand' unbound () #f)". I'm going to try a git bisect and see where this appears. Noah On Thu, Jan 26, 2012 at 5:06 PM, Andy Wingo wi...@pobox.com wrote: On Thu 26 Jan 2012 15:26, Noah Lavine noah.b.lav...@gmail.com writes: I just checked out the latest master. I attempted to build it, but got stuck at this point: Is it http://lists.gnu.org/archive/html/bug-guile/2011-12/msg00058.html ? Andy -- http://wingolog.org/
Re: unknown location: definition in expression context in subform optname-from of _^
Hello,

  /* Read expressions from that port; ignore the values. */
  for (;;)
    {
      SCM form = scm_read (port);
      if (SCM_EOF_OBJECT_P (form))
        break;
      ans = scm_primitive_eval_x (form);
    }
  return ans;
}
}

Every evaluation comes from here, and in this instance pzFile pointed to /path/to/aginfo.tpl and line was set to 163. Therefore, the location should *NOT* have been unknown, and in fact the displayed line number ought to have been 171. Why that failed I would very much want to know, so that I can fix this ag_scm_c_eval_string_from_file_line() thingy. I am not an expert, but this is my guess about what's happening: you get the form from the file with scm_read. scm_read returns a regular s-expression, not a syntax object. Then you pass this form to scm_primitive_eval_x. scm_primitive_eval_x has a plain s-expression with no way to know where it came from, so it prints "unknown location". The reason the standard Guile interpreter knows where things came from is that it parses files as syntax, not s-expressions, and syntax objects have embedded file and line information. But note that I didn't say that you could do this: I don't know of an interface to this functionality (although I believe this example shows that we need one). Noah
Re: Build Error in master
Got it! And unfortunately, it's a GC error. Here's what happens: symbols is an SCM object defined in symbols.c. It points to an scm_cell_t which has two elements: a type tag, and a pointer to an scm_weak_set_t. That scm_cell_t is at 0x10101cff0. However, that scm_cell_t is garbage collected in the scm_cons at symbols.c:250. The reason it gets filled with SCM_EOL is that the cons is scm_cons (SCM_BOOL_F, SCM_EOL). So I expect that the scm_cell_t is garbage collected and then immediately reclaimed to be the pair, which would result in its second cell being filled with SCM_EOL, which would explain why we later extract SCM_EOL from it. As a test, I changed the variable 'symbols' in symbols.c to be non-static. That didn't fix it, but then again, I don't really know how GC works yet. I can read the libgc documentation and try to figure this out, but can anyone point me to what I should be looking for? Thanks, Noah On Wed, Feb 1, 2012 at 4:01 AM, Andy Wingo wi...@pobox.com wrote: On Wed 01 Feb 2012 03:12, Noah Lavine noah.b.lav...@gmail.com writes: In the failing call, the SCM 'symbols' is 0x10101cff0, but the failing set is at 0x304, which has not been allocated. 0x304 is one of the iflags, SCM_EOL I think. So, I know it might not have anything to do with it, but can you verify that your guile includes patch 0aed71aa51e89e714de2392c2a5f44694dca77ea ? I just committed that last night, and although it does not seem to be related, who knows. Andy -- http://wingolog.org/
Re: Build Error in master
Hello, I inserted a GC_is_visible check in my code and learned that 'symbols' is visible, but the things it points to were getting garbage collected anyway. It seemed like a GC bug, so I'm trying to build Guile with the latest version of GC and hoping that fixes it. I just wanted to warn everyone that the GC_get_free_space_divisor function will break with libgc 7.2alpha6, because libgc includes the same function with the same name. I'm not sure what to do about that - probably try to automatically discover it in configure and put more #ifdefs around our definition (gc.c:212). Noah On Thu, Feb 2, 2012 at 9:59 PM, Noah Lavine noah.b.lav...@gmail.com wrote: Got it! And unfortunately, it's a GC error. Here's what happens: symbols is an SCM object defined in symbols.c. It points to an scm_cell_t which has two elements: a type tag, and a pointer to an scm_weak_set_t. That scm_cell_t is at 0x10101cff0. However, that scm_cell_t is garbage collected in the scm_cons at symbols.c:250. The reason it gets filled with SCM_EOL is that the cons is scm_cons (SCM_BOOL_F, SCM_EOL). So I expect that the scm_cell_t is garbage collected and then immediately reclaimed to be the pair, which would result in its second cell being filled with SCM_EOL, which would explain why we later extract SCM_EOL from it. As a test, I changed the variable 'symbols' in symbols.c to be non-static. That didn't fix it, but then again, I don't really know how GC works yet. I can read the libgc documentation and try to figure this out, but can anyone point me to what I should be looking for? Thanks, Noah On Wed, Feb 1, 2012 at 4:01 AM, Andy Wingo wi...@pobox.com wrote: On Wed 01 Feb 2012 03:12, Noah Lavine noah.b.lav...@gmail.com writes: In the failing call, the SCM 'symbols' is 0x10101cff0, but the failing set is at 0x304, which has not been allocated. 0x304 is one of the iflags, SCM_EOL I think. 
So, I know it might not have anything to do with it, but can you verify that your guile includes patch 0aed71aa51e89e714de2392c2a5f44694dca77ea ? I just committed that last night, and although it does not seem to be related, who knows. Andy -- http://wingolog.org/
Re: Build Error in master
Oh, and the same will happen with GC_get_suspend_signal, which we define at scmsigs.c:155. Other than that, though, the new gc resolves the problem I had. I am willing to chalk this up to a GC bug and not worry about it more. Noah On Tue, Feb 7, 2012 at 9:07 AM, Noah Lavine noah.b.lav...@gmail.com wrote: Hello, I inserted a GC_is_visible check in my code and learned that 'symbols' is visible, but the things it points to were getting garbage collected anyway. It seemed like a GC bug, so I'm trying to build Guile with the latest version of GC and hoping that fixes it. I just wanted to warn everyone that the GC_get_free_space_divisor function will break with libgc 7.2alpha6, because libgc includes the same function with the same name. I'm not sure what to do about that - probably try to automatically discover it in configure and put more #ifdefs around our definition (gc.c:212). Noah On Thu, Feb 2, 2012 at 9:59 PM, Noah Lavine noah.b.lav...@gmail.com wrote: Got it! And unfortunately, it's a GC error. Here's what happens: symbols is an SCM object defined in symbols.c. It points to an scm_cell_t which has two elements: a type tag, and a pointer to an scm_weak_set_t. That scm_cell_t is at 0x10101cff0. However, that scm_cell_t is garbage collected in the scm_cons at symbols.c:250. The reason it gets filled with SCM_EOL is that the cons is scm_cons (SCM_BOOL_F, SCM_EOL). So I expect that the scm_cell_t is garbage collected and then immediately reclaimed to be the pair, which would result in its second cell being filled with SCM_EOL, which would explain why we later extract SCM_EOL from it. As a test, I changed the variable 'symbols' in symbols.c to be non-static. That didn't fix it, but then again, I don't really know how GC works yet. I can read the libgc documentation and try to figure this out, but can anyone point me to what I should be looking for? 
Thanks, Noah On Wed, Feb 1, 2012 at 4:01 AM, Andy Wingo wi...@pobox.com wrote: On Wed 01 Feb 2012 03:12, Noah Lavine noah.b.lav...@gmail.com writes: In the failing call, the SCM 'symbols' is 0x10101cff0, but the failing set is at 0x304, which has not been allocated. 0x304 is one of the iflags, SCM_EOL I think. So, I know it might not have anything to do with it, but can you verify that your guile includes patch 0aed71aa51e89e714de2392c2a5f44694dca77ea ? I just committed that last night, and although it does not seem to be related, who knows. Andy -- http://wingolog.org/
Re: GNU Guile PEG-parser
Hello, Thanks for emailing! I suppose I am the one to talk to, since I was the last one to work on it. I didn't make the PEG parsing syntax, but I would guess the reason there isn't a string syntax for ignore is that there's no conventional way to write it, but there is for the other PEG elements. It would be easy to add one if it was useful, but we'd want to make sure our syntax agreed with other PEG libraries, so people wouldn't be confused later. For blank-space indented blocks, do you mean you want to group together lines with the same indentation, like Python syntax? If you know what the indentation will be at the beginning of each line, you can do something like this: (* (and \t match-line \n)), where you replace \t with whatever indentation you want. However, what you probably want to do is look at the indentation in the first line and then group it with every following line that has the same indentation. I'm not sure if it's possible, but it would probably be ugly. If you tell me what you're trying to do, though, I can help you write your own parser to handle it. You can even write some of your parser yourself and use PEGs for the rest, if you're willing to use PEG internals. Can you tell me more about what you're trying to do? I am happy to help now, but I will be more helpful if I know more. I'm going to CC the guile-devel mailing list because of the issue with the string syntax. Noah On Tue, Feb 7, 2012 at 10:03 AM, Krister Svanlund krister.svanl...@gmail.com wrote: Hi, I'm currently involved in a project that plans on using the PEG module for Guile for parsing and I've understod that you are the one to talk to about it. I'm mostly just curious how come there isn't an equivalent to ignore in string-patterns and if this would be complex to add? I'm also curious if there is any way to deal with blank-space indented blocks in PEG. Yours Krister Svanlund
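To make the fixed-indentation idea above concrete, here is a sketch in the s-expression syntax (assuming Guile's (ice-9 peg); `line-body' is a hypothetical nonterminal standing in for whatever one line should contain):

```scheme
(use-modules (ice-9 peg))

;; One line's contents: everything up to (but not including) the newline.
(define-peg-pattern line-body all
  (* (and (not-followed-by "\n") peg-any)))

;; A block of tab-indented lines: the tab and newline are matched but
;; dropped from the parse tree via `ignore'.
(define-peg-pattern block all
  (* (and (ignore "\t") line-body (ignore "\n"))))

(peg:tree (match-pattern block "\tfoo\n\tbar\n"))
```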
Re: GNU Guile PEG-parser
Hello, I've actually found no PEG library that has a string syntax for the equivalent of ignore. I'm guessing most people are satisfied with just specifying another nonterminal and matching that one. Probably because it is seen as less ugly than extending the formal definition of PEG, but I really think we could get a cleaner PEG definition of our parser if we were able to ignore text that wasn't needed or gets in the way while using string-patterns. That makes sense. I'm a bit surprised that you find string patterns easier than S-expression patterns, because I find it the other way around, but different people like different things. I think we could add some string syntax for ignore if you wanted it, although other people on the list should chime in. It's actually exactly Python I'm thinking about; we are currently doing a preprocessor that will put #{ and #} before and after each block, but I was hoping that there exists a cleaner solution using the power of PEG instead of basic string manipulation. If you could help in any way, shape or form that would be greatly appreciated; even just suggesting what parts of PEG internals to look at would be really useful. After thinking about it more, you have two choices. The easiest thing would be to parse each line (or extended line, if it ends with \) with a PEG parser, and use your own logic for the blocks. Your parser would have two steps for each line: 1. Get the indent from the beginning of the line. 2. Parse the rest of the line with a PEG parser. Then you would take the lines returned by the PEG parser and combine them into a data structure yourself, using the Python line-combining rules. This is probably your best choice. Your second choice depends on the fact that PEG parsers are just functions that take certain arguments and return certain values. You can write a function like that yourself and use it just like a PEG nonterminal in your grammar. 
When I was working on PEG, I actually thought that it would be nice to make this interface public so that different parser generators could interoperate, but I never did it. It's all documented in the PEG Internals section of the manual, though. However, I'd recommend against this just because I think the interface is not as good as it should be right now, so I'd probably want to change it in the future, which would make your code stop working. (Although if this is a one-time thing, then you don't need to care about that.) I suppose you also have a third choice, which is to change the internal interface yourself, then let us make it public, then use it that way. That's the most elegant solution, but it's more work for you. I wouldn't recommend it unless the first option is hard and you want this to last for a long time. I hope this helps, Noah
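The first choice above might start with something like this (a sketch using only core Guile string procedures; `indent-level' and `group-by-indent' are hypothetical helpers, and a real implementation would go on to build nested blocks from the indent numbers, Python-style):

```scheme
(define (indent-level line)
  ;; Number of leading spaces on LINE.
  (let loop ((i 0))
    (if (and (< i (string-length line))
             (char=? (string-ref line i) #\space))
        (loop (+ i 1))
        i)))

(define (group-by-indent lines)
  ;; Pair each line with its indent level; the rest of each line would
  ;; then be handed to a PEG parser, as described above.
  (map (lambda (l) (cons (indent-level l) (string-trim l)))
       lines))

(group-by-indent '("if x:" "  y = 1" "  z = 2"))
;; => ((0 . "if x:") (2 . "y = 1") (2 . "z = 2"))
```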
Adding Identities to Peval
Hello, I've been working on a patch to add a new sort of optimization to peval, and I think it's almost ready. It's based on some of the ideas in "Environment Analysis of Higher-Order Languages". The goal is to recognize when two quantities are equal even when we don't know what they are. My working example has been this expression: (let* ((x (random)) (y x)) (eq? x y)) The patch attached to this message lets peval optimize that to (begin (random) #t) This happens through objects called 'identities'. We make a new identity for every constant and the result of every procedure call. Identities are associated with values, not variables, so x and y above have the same identity. When it looks at the eq?, peval notices that x and y have the same identity, and concludes that they must be eq? even without knowing their value. The patch adds some members to the operand record type to store an identity for the operand, and an identity-visit function that matches the visit function used to get values. It also has a few utility functions and the definition of a new record type identity with no members. The logic added is pretty minimal. I tested it by running check-guile. It passes about the same number of tests as it did before I added the patch. There's one glaring wart. The identity checking is activated by calls to (toplevel 'eq?). Clearly, it should be activated by calls to the primitive 'eq? (and eqv? and equal?). The reason it's not is that my example above compiles to (call (toplevel-ref 'eq?) ...), and I don't know how to make it turn into a (primcall 'eq? ...). I tried a few variants, but they didn't work. What test code should I be using to produce a primcall? However, that problem is easy to fix, and besides that I believe the patch is correct. This also relates to my earlier ideas for a compiler. After thinking about it more, I'm not sure I like the direction I took in the wip-compiler branch, so I decided to start working on peval instead. 
This patch has an ulterior motive - besides adding a new optimization itself, it uses a lot of the infrastructure you'd want for more aggressive optimization like type-checking. So part of the purpose of this was to learn how to do that. What do you think? Noah 0001-Add-Identity-Optimization.patch Description: Binary data
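The transformation described in this message, as it might look at the REPL (a hypothetical session; `,optimize' is the Guile meta-command that shows the partial evaluator's output, though its exact printing may differ):

```scheme
scheme@(guile-user)> ,optimize (let* ((x (random)) (y x)) (eq? x y))
(begin (random) #t)
```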
Re: Adding Identities to Peval
Hello, Note, we don't need to add extra identities to operands: they already have their gensyms. So to get the effect of this patch, you could add a clause to fold-constants:

  ((primcall src 'eq? (lexical _ _ x) (lexical _ _ y))
   (if (eq? x y)
       (make-const src #t)
       keep-original-expression))

This works because peval already turns it into the following, after processing the operands for value: (let ((x (random))) (eq? x x)) Ah, that is a cleaner way to do it. But ideally I would like identities to be things that we can store in lists, so that

  (let* ((x (random))
         (y (list x))
         (z (car y)))
    (eq? x z))

optimizes to (begin (random) #t). That requires more infrastructure, though - probably some sort of struct representing a value, with parts for the constant value and the identity, that gets passed around in peval. Also, David pointed out that this doesn't have many uses. That is probably true. :-) I actually did it because it's a simpler version of type-checking, which *does* have many uses. In order to type-check, you need to be able to associate extra information with a value (its type) even in cases where you don't know the value. Identities work the same way, except that the information is simpler. So my ulterior motive in doing this is to learn how to do that, and set up whatever infrastructure I need for it. Noah
Re: Adding Identities to Peval
Hello, On Thu, Feb 16, 2012 at 10:06 AM, Andy Wingo wi...@pobox.com wrote: On Thu 16 Feb 2012 14:18, Noah Lavine noah.b.lav...@gmail.com writes: (let ((x (random))) (eq? x x)) (let* ((x (random)) (y (list x)) (z (car y))) (eq? x z)) This one should reduce to the same thing, but currently doesn't, because (list x) is not considered a constant expression because `let' is a constructor (quotes to indicate the language of peval). If we can propagate it (probably true in this case, given that it has only one use), then the expression folds as before:

  (let* ((x (random))
         (z (car (list x))))
    (eq? x z))
  => (let ((x-129 (random)))
       (eq? x-129 x-129))

So I don't think that for this purpose (identity) we need any more mechanism. So you're saying that the gensym that comes from psyntax should essentially be the identity marker, because there is one gensym for each value. To make sure I understand, in the x-y-z example, psyntax would produce different gensyms for x and z, but peval could merge them later on, right? In that case, peval would maintain the invariant "if two values must always be the same, they have the same gensym". Correct? This seems just as good as what I implemented. I am happy with either way. In order to type-check, you need to be able to associate extra information with a value (its type) even in cases where you don't know the value. For this purpose, have you considered creating a functional database of facts? Then you can add facts when you see conditionals, for example, and you get different facts in the different branches:

  (if test consequent alternate)
  ;; assuming the test doesn't fold
  => (make-if (for-test test current-facts)
              (for-tail consequent (add-fact test #t current-facts))
              (for-tail alternate (add-fact test #f current-facts)))

This is an interesting idea, and I hadn't really considered it. 
I had certainly planned to let the information associated with values change as the environment changes, which would let it use conditional and nonlocal-exit information. But a database goes beyond that. The big difference is that a database could express relationships between values - if information is associated with a value, where does the information "x < y" go? This makes me think that a database is a long-term solution. However, if you're not expressing relationships, then I think the functional database would just amount to keeping an environment mapping value names to information about them. I hadn't thought this through when I made my post, but I realize that this is what I would have to do to let my information-about-values change over time. So three paragraphs later, I think we agree :-). However, what I'd really like to do is use an IR that expresses control flow, as you say below. This is related to range analysis. But, it's also something that's easier to do with an explicit notion of control flow (i.e. a CPS IR). Yes, I think you're right. I would be very interested in working on a CPS IR, but I remember you had a blog post a couple months ago where you said you weren't sure if CPS or ANF was better for Guile. What do you think about intermediate representations now? Noah
Re: Adding Identities to Peval
Here is another patch that fixes the first of my examples. (let* ((x (random)) (y x)) (eq? x y)) now folds to (begin (random) #t). It turned out to be much smaller than the previous approach. Is it all right if I push this? I'm not sure how to make the other one work yet, but I will think about it. I will send a separate email about the CPS stuff. Noah On Fri, Feb 17, 2012 at 3:13 AM, Andy Wingo wi...@pobox.com wrote: Hi Noah, On Fri 17 Feb 2012 03:22, Noah Lavine noah.b.lav...@gmail.com writes: (let* ((x (random)) (y (list x)) (z (car y))) (eq? x z)) To make sure I understand, in the x-y-z example, psyntax would produce different gensyms for x and z, but peval could merge them later on, right? When processing `z' for value, it should reduce to `x'. The strategy is to do computation at compile time -- if at compile time `z' can reduce to `x', peval will do it. But there is no global "give me all the things that are equal to x" procedure. In that case, peval would maintain the invariant "if two values must always be the same, they have the same gensym". Correct? More like the other way around! If one identifier partially evaluates to another, then they will have the same gensym, and therefore are the same identifier. This is related to range analysis. But, it's also something that's easier to do with an explicit notion of control flow (i.e. a CPS IR). Yes, I think you're right. I would be very interested in working on a CPS IR, but I remember you had a blog post a couple months ago where you said you weren't sure if CPS or ANF was better for Guile. What do you think about intermediate representations now? I have no idea :-) I am a little hesitant to move to either one right now, because our stack machine penalizes named temporaries, and with CPS (or ANF -- which is closer to what we have, but without the advantage of treating control flow on a first-class level) you would get a lot more named temporaries. 
But my uncertainty is also because I don't know what it would make the compiler look like. Would it simplify things or would it add complication? I don't know. But, writing a pass to turn tree-il into some CPS language would be a really interesting step in any case. If you wanted to work on that, I'm sure it would be instructive to us all. My humble thoughts :) Cheers, Andy -- http://wingolog.org/ 0001-Optimize-Equality-Primitives.patch Description: Binary data
CPS and Guile
Hello everyone, There was some discussion in the "Identities in Peval" thread about using a CPS-like intermediate representation in Guile. I think that our goal with our intermediate representation should be to make things convenient for us - we shouldn't try to match any particular model unless it helps us, and we should be happy customizing it for ourselves. From this point of view, I see two orthogonal issues with Tree-IL. First, some of the tree-il forms have their own control-flow properties, which doesn't work well with a uniform representation of control flow. They also complicate the big match statements in peval. Why not make dynlet, dynwind, dynref, dynset, prompt, and abort into primitives? Code that cares can still match them, and code that doesn't care can ignore them. Second, the idea of annotating tree-il forms with their continuations. I think that could actually be done fairly simply, if we remove the constraint that they have to be useful :-). I'm imagining that at first continuations would be like source information, in that they are passed everywhere but the compiler doesn't use them. You could even keep a separate hash table of continuations if you really wanted to, but that seems like more effort to me. I think it would be fine to have a hybrid compiler for a bit that did some analyses using the continuation information and some using the tree-il structure. That might help with the issue of named temporaries, too - we could still generate code from the tree structure. What do you think? Noah
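To make the contrast concrete, here is roughly what naming continuations buys (an illustrative sketch only, not Guile's actual IR; `f' and `k' are arbitrary names):

```scheme
;; Direct style: the intermediate result of (f x) and the point the
;; whole expression returns to are both implicit.
(+ (f x) 1)

;; CPS style: every intermediate result (t) and every continuation (k)
;; is named, so a pass can attach facts to them and control flow is
;; explicit in the term itself.
(f x (lambda (t) (k (+ t 1))))
```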
Re: manual examples - self contained or not?
Hello, My vote would be for self-contained: it can be copied directly into a file or REPL and executed, and IMO reduces confusion. I already try to do this when posting any examples on paste.lisp.org or in a gist. When you say self-contained, do you mean that if the example uses a procedure from a module, it should have a (use-modules ...) statement? If so, that sounds good to me. If you mean that most examples should never use procedures that are in modules, that seems a little restrictive. Noah
Re: Trouble using (current-filename)
What about having two bits of syntax, current-filename and current-file-path? Or better yet, current-filename and current-file-directory, with the guarantee that (string-append (current-file-directory) path-separator (current-filename)) points to the file when it was compiled? It is more intuitive to me that (current-filename) would return only the file name, with no other path components, and that I would use some other form to get those. Noah On Sun, Feb 19, 2012 at 4:23 PM, Andy Wingo wi...@pobox.com wrote: On Sun 19 Feb 2012 22:02, l...@gnu.org (Ludovic Courtès) writes: See Neil's use case here: http://thread.gmane.org/gmane.lisp.guile.devel/13440/focus=13621 Can we do something that makes sense for both cases? Well, (add-to-load-path (dirname (canonicalize-path (current-filename ? Yes it’s verbose, but it’s predictable and clearly specified. It's terribly verbose, IMO. If we can do something better for those users, we should. What did you think of my other suggestion about searching for the file in the load path, and only canonicalizing if it was not found in the load path? (If it is found in the load path, then it would be returned in its relative form.) Andy -- http://wingolog.org/
Re: Adding Identities to Peval
Done. I added two tests, for good measure :-). Noah On Sun, Feb 19, 2012 at 4:53 AM, Andy Wingo wi...@pobox.com wrote: On Sat 18 Feb 2012 17:20, Noah Lavine noah.b.lav...@gmail.com writes: Here is another patch that fixes the first of my examples. (let* ((x (random)) (y x)) (eq? x y)) now folds to (begin (random) #t). It turned out to be much smaller than the previous approach. Is it all right if I push this? Looks good to me, if you add a test at the same time :-) Andy -- http://wingolog.org/
Re: Non-stack-copying call-with-current-continuation?
If I understand correctly, you only need to call the continuation once. Is that right? In that case, I think you could use either catch and throw or prompts. (And catch and throw are implemented with prompts, so really, you can just choose how you want to use prompts.) Noah On Thu, Mar 1, 2012 at 7:00 PM, David Kastrup d...@gnu.org wrote: Hi, I am just meddling around with coding and have come up with the following:

(define-public (find-child music predicate)
  "Find the first node in @var{music} that satisfies @var{predicate}."
  (catch 'music-found
    (lambda ()
      (fold-some-music predicate
                       (lambda (music . _)
                         (throw 'music-found music))
                       #f music))
    (lambda (key music) music)))

Now the problem with that is that it is unhygienic. If fold-some-music were to use music-found signals, or if the predicate did, things would be awkward. One would need to work with a uniquely generated symbol. It turns out that the above can be expressed much clearer and cleaner as

(define-public (find-child music predicate)
  "Find the first node in @var{music} that satisfies @var{predicate}."
  (call-with-current-continuation
   (lambda (music-found)
     (fold-some-music predicate
                      (lambda (music . _)
                        (music-found music))
                      #f music))))

at least if I did not make some thinko here. It is basically the same code and stack-upwards-only, but hygienic. Nothing can call the continuation but what is inside. Well, of course fold-some-music could save the closure calling music-found for later. But it doesn't. Is there a way to get a call-with-current-continuation that does not create a stack copy? It is fine if it fails with an exception if one still tries calling the continuation after it has already returned. -- David Kastrup
Re: Non-stack-copying call-with-current-continuation?
Oh yes, you're right. I'm sorry I missed it. I believe you can do it hygienically though. With prompts, you can use (make-prompt-tag) to generate a new, unique tag. With catch and throw, you could use (gensym) to do the same thing. Your first example would become something like (define-public (find-child music predicate) "Find the first node in @var{music} that satisfies @var{predicate}." (let ((music-found-tag (gensym))) (catch music-found-tag (lambda () (fold-some-music predicate (lambda (music . _) (throw music-found-tag music)) #f music)) (lambda (key music) music)))) Does that work? Noah On Thu, Mar 1, 2012 at 7:42 PM, David Kastrup d...@gnu.org wrote: Noah Lavine noah.b.lav...@gmail.com writes: On Thu, Mar 1, 2012 at 7:00 PM, David Kastrup d...@gnu.org wrote: Hi, I am just meddling around with coding and have come up with the following: (define-public (find-child music predicate) "Find the first node in @var{music} that satisfies @var{predicate}." (catch 'music-found (lambda () (fold-some-music predicate (lambda (music . _) (throw 'music-found music)) #f music)) (lambda (key music) music))) Now the problem with that is that it is unhygienic. If fold-some-music were to use music-found signals, or if the predicate did, things would be awkward. One would need to work with a uniquely generated symbol. It turns out that the above can be expressed much more clearly and cleanly as (define-public (find-child music predicate) "Find the first node in @var{music} that satisfies @var{predicate}." (call-with-current-continuation (lambda (music-found) (fold-some-music predicate (lambda (music . _) (music-found music)) #f music)))) at least if I did not make some thinko here. It is basically the same code and stack-upwards-only, but hygienic. Nothing can call the continuation but what is inside. Well, of course fold-some-music could save the closure calling music-found for later. But it doesn't. Is there a way to get a call-with-current-continuation that does not create a stack copy? 
It is fine if it fails with an exception if one still tries calling the continuation after it has already returned. If I understand correctly, you only need to call the continuation once. Is that right? In that case, I think you could use either catch and throw or prompts. Neither of which is hygienic, and both are less straightforward. It is not like I did not start my posting by giving the catch/throw-based example and saying what I disliked about it. -- David Kastrup
Re: Non-stack-copying call-with-current-continuation?
Hello, IIRC, the stack copying in Guile's continuation implementation is inevitable. Though it's inefficient, the reason is cooperation with other languages such as C. It's inevitable in the general case, where the continuation might return multiple times. However, in this situation, he knows the continuation will only ever return once, so it is not necessary to copy the stack. This control structure would be the equivalent of setjmp and longjmp in C. Prompts have an optimization where they don't actually copy the stack if it is clear that they can only return once. That is why catch and throw are implemented that way. Noah
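To illustrate the one-shot case being discussed, here is a small self-contained sketch of an escape via a prompt; because the handler discards the captured continuation k, an implementation is free to treat the abort like longjmp and never copy the stack:

```scheme
;; Multiply a list, bailing out to the prompt handler on the first zero.
;; The delimited continuation `k' is never invoked, so this escape is
;; one-shot, like setjmp/longjmp.
(define (product lst)
  (let ((tag (make-prompt-tag)))
    (call-with-prompt tag
      (lambda ()
        (let loop ((l lst) (acc 1))
          (cond ((null? l) acc)
                ((zero? (car l)) (abort-to-prompt tag))
                (else (loop (cdr l) (* acc (car l)))))))
      (lambda (k) 0))))
```

Here (product '(1 2 0 4)) returns 0 from the handler without multiplying the rest of the list.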
Re: Non-stack-copying call-with-current-continuation?
Hello, Sure, but things like gensym and make-prompt-tag (and (list '()) for creating an eq?-unique object) are artificial hygiene coming at a cost in symbol table and symbol generation time rather than lexical hygiene. They need _extra_ work, whereas the call-with-current-continuation approach needed _less_ work. Basically I want something like call-with-single-continuation that will only allow one return (and any dynwind out counts and should work if it is the first, so it is not exactly equivalent to using with-continuation-barrier) and come without the stack-copying cost of call-with-current-continuation. I agree that it's not pretty. We have hygienic macros so we don't have to use gensym, after all. But I don't know of a better way. Sure, you can do (define (call-with-single-continuation proc) (let ((tag (gensym))) (catch tag (proc (lambda x (throw tag x))) (lambda (tag x) (apply values x))))) Oops. (apply values ...)? Does this even work? Anyway, you get the drift. But it is not pretty and it is not primitive. And its failure mode when you disobey the single-use contract is an uncaught exception with a weird symbol. Well, I can't fix all of that, but you could at least make it fail better like this: (define (call-with-single-continuation proc) (let ((tag (gensym)) (thrown #f)) (catch tag (proc (lambda x (if thrown (error "Single continuation called twice") (begin (set! thrown #t) (throw tag x))))) (lambda (tag . x) (apply values x))))) I think that will do what you want. It doesn't look too bad to me, although I agree with your point in general - using symbols to denote names, like prompts, undoes the hygiene that Scheme is supposed to have. But the only solution I can think of is pretty messy. Noah
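For comparison, a prompt-based sketch of the call-with-single-continuation David describes (a hypothetical name, not an existing Guile binding): the prompt tag is already an eq?-unique object, so no gensym is needed, and a flag turns a second invocation into a clear error rather than an uncaught exception with a weird symbol:

```scheme
;; Sketch of a one-shot escape procedure built on prompts.
(define (call-with-single-continuation proc)
  (let ((tag (make-prompt-tag))
        (used #f))
    (call-with-prompt tag
      (lambda ()
        (proc (lambda vals
                (if used
                    (error "single-shot continuation called twice")
                    (begin
                      (set! used #t)
                      (apply abort-to-prompt tag vals))))))
      (lambda (k . vals)
        ;; k is discarded, so the context can never be re-entered.
        (apply values vals)))))
```

If the escape procedure is invoked after call-with-prompt has already returned normally, abort-to-prompt fails with an "unknown prompt" error, which still names the problem more clearly than a gensym'd throw key.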
Re: [PATCH] tree-il-scheme improvements
This is great! Thanks for improving this so much. Noah On Thu, Mar 1, 2012 at 6:40 PM, Mark H Weaver m...@netris.org wrote: I wrote: Here's a significantly refactored version of my 'tree-il-scheme' improvements. and here are the actual patches, with the psyntax-pp.scm portions removed. make -C module ice-9/psyntax-pp.scm.gen to regenerate. Mark
Re: Google Summer of Code 2012
I do not know if that idea is still valid. However, here are two more to add to that list: - Integration with Emacs. Guile has a very-nearly-complete implementation of Elisp. We'd like to get it to the point that it can actually run Emacs, and see if we can implement GNU's editor better than the standard Elisp interpreter. This project will require converting Emacs' C code to use Guile's object system, and possibly working on an Emacs Lisp compiler implemented in Scheme. - Compilation and speed. Guile has a pretty good compiler right now, but we always want more speed. The student could take this in different directions depending on interest. One idea that could take about a summer is to compile Guile to a register virtual machine instead of the current stack VM. I would love to help with either of these, although I am not sure I know enough to be the only mentor for them. Noah On Sun, Mar 4, 2012 at 10:50 AM, Giuseppe Scrivano gscriv...@gnu.org wrote: Hi! I am going through the list of the ideas for the Google Summer of Code 2011. I am wondering if this list is still valid: http://www.gnu.org/software/soc-projects/ideas-2012.html#guile Otherwise, do you have something to suggest? Thanks! Giuseppe
Re: The dynamic stack
Hello, The “dynwind stack” actually (I misread it the first time.) Yes, it did have this name before. (More often, the wind list.) But since dynwind is overloaded so much (dynamic-wind operator, dynwind, scm_dynwind_*), and the dynamic stack can have other things on it like prompts, I thought it best to give it a new name. I see your point about the old name, but I find the new name confusing. What about the stack that holds regular program variables and temporaries? That is also dynamic, so at first I thought you were referring to that. I didn't know what was going on until this email. I thought of control stack, since the things on it deal with control flow. But that is also confusing, because it doesn't include return addresses and frame pointers. Noah
Build Error in Master
When building the latest git master, I get this error: GUILEC language/tree-il/compile-glil.go Assertion failed: (table->n_items < size), function rob_from_rich, file weak-table.c, line 252. Has weak-table.c changed recently? Noah
Re: [PATCH] primitive resolution for public refs
Hello, Are you sure this is correct? What if someone calls the module-set! function on that module between the definition and use of whatever function you're compiling? I agree very strongly that we need to be able to do this sort of optimization, but I think we might need some sort of caching mechanism to do it right, where the cache of compiled function bodies is invalidated by module-set!. Noah On Mon, Mar 12, 2012 at 12:42 PM, BT Templeton b...@hcoop.net wrote: * module/language/tree-il/primitives.scm (resolve-primitives!): Resolve public module-refs to primitives. --- module/language/tree-il/primitives.scm | 16 +--- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/module/language/tree-il/primitives.scm b/module/language/tree-il/primitives.scm index 3c6769d..1dcab20 100644 --- a/module/language/tree-il/primitives.scm +++ b/module/language/tree-il/primitives.scm @@ -265,13 +265,15 @@ (module-variable mod name))) (lambda (name) (make-primitive-ref src name ((module-ref src mod name public?) - ;; for the moment, we're disabling primitive resolution for - ;; public refs because resolve-interface can raise errors. - (let ((m (and (not public?) (resolve-module mod - (and m - (and=> (hashq-ref *interesting-primitive-vars* - (module-variable m name)) - (lambda (name) (make-primitive-ref src name)) + (and=> (and=> (resolve-module mod) + (if public? + module-public-interface + identity)) + (lambda (m) + (and=> (hashq-ref *interesting-primitive-vars* + (module-variable m name)) + (lambda (name) + (make-primitive-ref src name)) ((call src proc args) (and (primitive-ref? proc) (make-primcall src (primitive-ref-name proc) args))) -- 1.7.9.1
Re: [PATCH] primitive resolution for public refs
Hello, I don't think I've changed the correctness wrt `module-set!' and the like -- the current, private-refs-only optimization already exhibits the problem you describe: | scheme@(guile-user)> (define old-car car) | scheme@(guile-user)> (module-set! (current-module) I think (current-module) should be (guile) here. Otherwise I think you will change 'car in (guile-user) and not affect (guile). But I have no doubt that this bug is real. | 'car | (lambda (x) (format #t "car ~A~%" x) (old-car x))) | scheme@(guile-user)> ,x (lambda (x) ((@@ (guile) car) x)) | [...] | Disassembly of #<procedure 1d23ae0 at <current input>:4:0 (x)>: | | 0 (assert-nargs-ee/locals 1) ;; 1 arg, 0 locals | 2 (local-ref 0) ;; `x' | 4 (car) | 5 (return) I'm not certain it's otherwise correct; I just added a workaround for the specific issue mentioned in the comment. Yes, you're right. That just raises the question, though: what's the right thing to do? I realized that you could argue this is part of making a closure, because it closes over the current value of variables in other modules. That doesn't match the behavior of regular closures, though: they just close over storage locations, so they can still be affected by external set!'s. Noah
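The hazard under discussion can be sketched at a fresh REPL like this. A sketch only, not the patch's behavior; whether first-of actually inlines car depends on the compiler and Guile version, and first-of/old-car are hypothetical names:

```scheme
;; If the compiler resolves the (guile) binding of `car' to the car
;; primitive while compiling `first-of', a later module-set! cannot
;; affect the already-compiled body.
(define (first-of lst) (car lst))   ; may compile to the `car' opcode

(define old-car car)
(module-set! (resolve-module '(guile)) 'car
             (lambda (x) (pk 'traced-car x) (old-car x)))

;; A late-bound lookup sees the replacement...
((module-ref (resolve-module '(guile)) 'car) '(1 2 3))
;; ...while `first-of' may keep using the inlined primitive and trace nothing.
(first-of '(1 2 3))
```

This is why the thread suggests that cross-module primitive resolution needs either closure-like semantics for imported bindings or a cache invalidated by module-set!.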
Re: Build Error in Master
This problem went away for me when I made distclean and built the most recent master. I don't know why that fixed it. 2012/3/9 Ludovic Courtès l...@gnu.org: Hi Noah! Noah Lavine noah.b.lav...@gmail.com skribis: When building the latest git master, I get this error: GUILEC language/tree-il/compile-glil.go Assertion failed: (table->n_items < size), function rob_from_rich, file weak-table.c, line 252. I’m seeing that too sometimes when running ./check-guile, with this backtrace (for all threads): --8<---cut here---start->8--- Core was generated by `/data/src/guile-2.1/libguile/.libs/guile --debug --no-auto-compile -e main -s /'. Program terminated with signal 6, Aborted. #0 0x7fe5a0a7e85e in raise () from /nix/store/vxycd107wjbhcj720hzkw2px7s7kr724-glibc-2.12.2/lib/libc.so.6 Thread 6 (Thread 8050): #0 0x7fe5a20155ac in pthread_cond_wait@@GLIBC_2.3.2 () from /nix/store/vxycd107wjbhcj720hzkw2px7s7kr724-glibc-2.12.2/lib/libpthread.so.0 No symbol table info available. #1 0x7fe5a25601bb in scm_pthread_cond_wait (cond=0x28abe88, mutex=0x23d4370) at threads.c:1986 res = -512 t = 0x28abe00 #2 0x7fe5a25606d8 in block_self (queue=0x1c4c610, sleep_object=<value optimized out>, mutex=0x23d4370, waittime=0x0) at threads.c:451 t = 0x28abe00 q_handle = 0x1e173c0 err = <value optimized out> #3 0x7fe5a2560d43 in fat_mutex_unlock (mutex=0x1c4c690, cond=0x1c4c620, waittime=0x0, relock=1) at threads.c:1627 brk = 0 owner = 0x1cc5ed0 m = 0x23d4370 c = 0x1bb20c0 t = 0x28abe00 err = <value optimized out> ret = <value optimized out> #4 0x7fe5a2560ef9 in scm_timed_wait_condition_variable (cv=0x1c4c620, mx=0x1c4c690, t=0x904) at threads.c:1811 waittime = {tv_sec = 47438496, tv_nsec = 140624247786400} waitptr = <value optimized out> #5 0x7fe5a256c14f in vm_debug_engine (vm=0x1cc5e80, program=0x21801c0, argv=0x0, nargs=0) at vm-i-system.c:911 subr = 0x7fe5a2560e90 <scm_timed_wait_condition_variable> vp = 0x2890e70 objects = 0x7fe5a27f1cf0 stack_limit = 0x300e000 current_thread = 0x28abe00 registers = {{__jmpbuf = 
{42647040, 80810306853846533, 23377728, 140735361079616, 2308, 42536560, -84392667918189051, -84345635096707579}, __mask_was_saved = 0, __saved_mask = {__val = {4343230760305832849, 36672864, 36672736, 140624247882320, 36672864, 140624247774144, 140624090466832, 24, 140624246742752, 140624246741380, 21986976, 21986984, 4343230760305832849, 140624247882320, 36672864, 30170752}}}} jump_table_pointer = 0x164b740 jump_table = 0x164b740 #6 0x7fe5a2560158 in really_launch (d=0x7fff81345920) at threads.c:1002 data = 0x7fff81345920 thunk = 0x21801c0 handler = 0x904 t = 0x28abe00 #7 0x7fe5a24eae8a in c_body (d=0x7fe598f58e40) at continuations.c:513 data = 0x7fe598f58e40 #8 0x7fe5a256c427 in vm_debug_engine (vm=0x1cc5e80, program=0x16317e0, argv=0x7fe598f58d90, nargs=4) at vm-i-system.c:973 smob = 0x28abe8c subr = 0x7fe5a2561b60 <apply_catch_closure> vp = 0x2890e70 objects = 0x22f9508 stack_limit = 0x300e000 current_thread = 0x28abe00 registers = {{__jmpbuf = {23271392, 8081784237061, 23377728, 140624247304752, 140624090467904, 140624090467904, -84392667825914363, -84345635096707579}, __mask_was_saved = 0, __saved_mask = {__val = {0, 6, 128, 524288, 140624244487166, 524288, 140624246741380, 6, 32768, 16, 140624246742752, 140624246741380, 1, 140624090467904, 140624244488924, 30170752}}}} jump_table_pointer = 0x164b740 jump_table = 0x164b740 #9 0x7fe5a24f0623 in scm_call_4 (proc=0x16317e0, arg1=<value optimized out>, arg2=<value optimized out>, arg3=<value optimized out>, arg4=<value optimized out>) at eval.c:512 args = {0x404, 0x22f9560, 0x22f9540, 0x22f9520} #10 0x7fe5a24eb603 in scm_i_with_continuation_barrier (body=0x7fe5a24eae80 <c_body>, body_data=0x7fe598f58e40, handler=0x7fe5a24eb230 <c_handler>, handler_data=0x7fe598f58e40, pre_unwind_handler=<value optimized out>, pre_unwind_handler_data=<value optimized out>) at continuations.c:451 stack_item = 22102448 thread = 0x28abe00 old_controot = 0x1cc5ec0 old_contbase = 0x7fe598f58eac result = 0xfe00 #11 0x7fe5a24eb6b5 in 
scm_c_with_continuation_barrier (func=<value optimized out>, data=<value optimized out>) at continuations.c:547 c_data = {func = 0x7fe5a25600b0 <really_launch>, data = 0x7fff81345920, result = 0x7fff81345920} #12 0x7fe5a255f99a in with_guile_and_parent (base=0x7fe598f58ea0, data=value optimized