Re: Labels as values and threaded-code interpretation
On 2013-06-02 19:44, Walter Bright wrote: The curious question is why it never gets into the newer C and C++ Standards. It's like inline assembly. I think basically every compiler supports it but it's not in the standard. At least there's a compiler for each platform supporting it. -- /Jacob Carlborg
Re: Labels as values and threaded-code interpretation
Am 02.06.2013 06:49, schrieb Walter Bright: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- Paulo
Re: Labels as values and threaded-code interpretation
On 6/1/13 9:49 PM, Walter Bright wrote: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere. While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status.
Re: Labels as values and threaded-code interpretation
Am 02.06.2013 09:25, schrieb Brad Roberts: On 6/1/13 9:49 PM, Walter Bright wrote: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere. While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status. If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all.
Re: Labels as values and threaded-code interpretation
On 6/2/13 12:33 AM, Paulo Pinto wrote: Am 02.06.2013 09:25, schrieb Brad Roberts: On 6/1/13 9:49 PM, Walter Bright wrote: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere. While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status. If your world is only UNIX then yeah, there are still lots of non UNIX os out there if you look outside the desktop computers. Of course, one could always question if they matter at all. I agree. I said unix primarily to mean most except for msvc. Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs.
Re: Labels as values and threaded-code interpretation
01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
On Saturday, 1 June 2013 at 05:29:28 UTC, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions It would also solve the problem with IASM and being unable to address the labels in iasm I noticed the problem when I tried to address the db'ed bytes. http://dpaste.dzfl.pl/36bbd7d3 Commented out //mov dword ptr RAX, meh; ??? illustrates the problem. Jumping to labels in IASM blocks seems to work ( http://dpaste.dzfl.pl/d9e47f70 ). I guess we could have two chickens backed on the same fire ;)
Re: Labels as values and threaded-code interpretation
On 2 June 2013 17:25, Brad Roberts bra...@puremagic.com wrote: On 6/1/13 9:49 PM, Walter Bright wrote: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere. While you're technically correct, every major compiler in the unix world support it with the same syntax. It's entered into defacto standard status. MSVC doesn't support it, which is really annoying.
Re: Labels as values and threaded-code interpretation
On 02-06-2013 08:44, Paulo Pinto wrote: Am 02.06.2013 06:49, schrieb Walter Bright: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. I always have fun in public forums making people aware that what they think is C or C++ code is actually compiler defined behaviour. Not much people seem to care to read language standards. :) -- Paulo I am perfectly aware that it's a GNU C extension. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 02-06-2013 10:52, Dmitry Olshansky wrote: 01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; I'm not sure that can support threaded-code interpretation. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
02-Jun-2013 20:48, Alex Rønne Petersen пишет: On 02-06-2013 10:52, Dmitry Olshansky wrote: 01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; I'm not sure that can support threaded-code interpretation. Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
On 6/2/2013 12:49 AM, Brad Roberts wrote: Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs. The curious question is why it never gets into the newer C and C++ Standards.
Re: Labels as values and threaded-code interpretation
On 02-06-2013 19:43, Dmitry Olshansky wrote: 02-Jun-2013 20:48, Alex Rønne Petersen пишет: On 02-06-2013 10:52, Dmitry Olshansky wrote: 01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; I'm not sure that can support threaded-code interpretation. Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 02-06-2013 19:44, Walter Bright wrote: On 6/2/2013 12:49 AM, Brad Roberts wrote: Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs. The curious question is why it never gets into the newer C and C++ Standards. That one's easy: It makes 'too many' assumptions about the capabilities of the target machine. Sort of like assuming signed integers overflow. In practice, though, any relevant target can support computed gotos. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 6/2/13 10:44 AM, Walter Bright wrote: On 6/2/2013 12:49 AM, Brad Roberts wrote: Many if not most of those non-unix platforms use gcc, which is included in the set of compilers that does support it. The point is, which you didn't change, is that's it's a defacto standard even if not technically in the c or c++ language specs. The curious question is why it never gets into the newer C and C++ Standards. I can only surmise since I've never been a part of those processes, but: 1) since the support is already there for the people that want it, why go through the hassle? 2) going through the hassle just to get it more broadly supported is a lot of hassle. 3) people are generally fundamentally allergic to hassle. Which also lines up with why people are asking for it to be added to d, because it's not already available, so it's worth the hassle.
Re: Labels as values and threaded-code interpretation
02-Jun-2013 21:49, Alex Rønne Petersen пишет: On 02-06-2013 19:43, Dmitry Olshansky wrote: 02-Jun-2013 20:48, Alex Rønne Petersen пишет: On 02-06-2013 10:52, Dmitry Olshansky wrote: 01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; I'm not sure that can support threaded-code interpretation. Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another. BTW Globals are actually quite expansive to access anyway, this was a mock example. -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
On 02-06-2013 19:58, Dmitry Olshansky wrote: 02-Jun-2013 21:49, Alex Rønne Petersen пишет: On 02-06-2013 19:43, Dmitry Olshansky wrote: 02-Jun-2013 20:48, Alex Rønne Petersen пишет: On 02-06-2013 10:52, Dmitry Olshansky wrote: 01-Jun-2013 20:13, Timon Gehr пишет: On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. Same here. Though I believe a way to force tail-call can support the same use case also helping functional programming. Say: goto Call-Expression; //forced tail call instead of: return Call-Expression; I'm not sure that can support threaded-code interpretation. Why not: alias OpCode = function void(); OpCode[] opcodes = [ opcode_1, ... ]; int pc; ... void opcode_1() { ... //pick operands do whatever pc = pc + num_operands; //skip over arguments OpCode next = cast(OpCode)bytecode[pc]; goto next(); //this is baked-in threaded dispatch } void opcode_2(){ ... } //say bytecode contains operands and addresses of fucntions void execute(size_t[] bytecode, int pc) { OpCode start = cast(OpCode)bytecode[pc]; pc++; goto start(); } One can get away without casting if data is in a separate array. Then this solution is perfectly safe in this limited form. Call would only point to a function hence no problem with jumping who knows where and less problems for optimizer. The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running. Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). I'd frankly be surprised if that was the case. But without any investigation, it's just word against word. And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another. It depends on what you mean by context. Can you elaborate? BTW Globals are actually quite expansive to access anyway, this was a mock example. Sure. Either way, evidence suggests that this style of interpreter has performance advantages in practice. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 6/2/2013 10:58 AM, Dmitry Olshansky wrote: Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). This should not be the case with register allocation using live ranges. You can also help with this by organizing declarations so the variables have the shortest scope.
Re: Labels as values and threaded-code interpretation
02-Jun-2013 22:54, Walter Bright пишет: On 6/2/2013 10:58 AM, Dmitry Olshansky wrote: Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). This should not be the case with register allocation using live ranges. The problem is that if every goto *get_me_new_opcode compile may have no idea of where it will through it with computed labels. Hence some conservativeness of what registers contain and reloads and in effect all labels end up as isolated functions. Plus a big stack frame is liability. And then there is a fair share of obstacles in making the kind of threaded interpreter code optimized properly, this post I find interesting: http://article.gmane.org/gmane.comp.lang.lua.general/75426 You can also help with this by organizing declarations so the variables have the shortest scope. Then splitting opcodes as functions separates their timespan even more. -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
02-Jun-2013 22:25, Alex Rønne Petersen пишет: On 02-06-2013 19:58, Dmitry Olshansky wrote: The problem here is assuming the interpreter state can be global. Once you make it non-global (and thus have to pass it in the goto start(...) call) you get all the overhead of a regular function call. One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running. Anyhow cramming the whole interpreter in one huge function too has downsides (and hopelessly inefficent register allocation is one). I'd frankly be surprised if that was the case. But without any investigation, it's just word against word. See my reply to Walter. Again I personally have not done any measurements but what that post by LuaJIt author makes a lot of sense to me. And you still access most locals via stack pointer. And then there got to be something beyond locals - 'context', that is still passed from one bytecode execution to another. It depends on what you mean by context. Can you elaborate? Context is what a virtual thread represents in the global world. Any outside hooks into it like argv in C's main are there, function tables (=extensions) and whatnot. -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
02-Jun-2013 22:25, Alex Rønne Petersen пишет: On 02-06-2013 19:58, Dmitry Olshansky wrote: One pointer to a struct MyInterpreterState. Think of it as 'this' pointer :) That's one load and store for every single step in the interpreter. Sounds harmless, but every cycle matters when every goto start(...) amounts to a single instruction in the program you're running. If the instruction set is RISC-y then you are in trouble anyway (writing interpreter not in ASM). One way out is fusing multiple simple instructions into macro-instructions and interpreting those. Either way, evidence suggests that this style of interpreter has performance advantages in practice. Agreed. Though this day I'd go for what I call bastardized JIT. In essense just produce a chunk of call XYZ instructions, comnditionals would have to have a test + jmp. And then there is 'ret' at the end of this chunk ~4 instructions to have to deal with. What you get is branch prediction done on the hardware level and no instruction counter to worry about. -- Dmitry Olshansky
Re: Labels as values and threaded-code interpretation
On 6/2/2013 1:10 PM, Dmitry Olshansky wrote: The problem is that if every goto *get_me_new_opcode compile may have no idea of where it will through it with computed labels. Hence some conservativeness of what registers contain and reloads and in effect all labels end up as isolated functions. Plus a big stack frame is liability. And then there is a fair share of obstacles in making the kind of threaded interpreter code optimized properly, this post I find interesting: http://article.gmane.org/gmane.comp.lang.lua.general/75426 Thanks for the interesting article.
Re: Labels as values and threaded-code interpretation
On Saturday, 1 June 2013 at 17:44:02 UTC, Rob T wrote: I wonder if this sort of thing would help solve another implementation problem, which is implementing very fast coroutines without the overhead seen with similar features such as fibers. Here's some evidence that the answer to your question is yes. http://www.cs.arizona.edu/icon/jcon/gde97.pdf
Re: Labels as values and threaded-code interpretation
Alex Rønne Petersen: I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't write them efficient in scripting languages, and even Haskell/F# are not good at all to write them. Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos. Bye, bearophile
Re: Labels as values and threaded-code interpretation
Alex Rønne Petersen: final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; Regarding the syntax, why do you use ? Isn't a single enough? case imm: lbl = handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile
Re: Labels as values and threaded-code interpretation
On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote: As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophile Would be cool if there was a platform independent way in D to construct a sequence of call instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example. It would be relatively easy to add to phobos although you'd have to be careful about DEP. Hmm, perhaps I will try writing something like this...
Re: Labels as values and threaded-code interpretation
As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophile
Re: Labels as values and threaded-code interpretation
On 06/01/2013 07:29 AM, Alex Rønne Petersen wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. I'd also like to see this. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x = !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; } } insn.lbl = lbl; } ... (This must be supported at compile time!)
Re: Labels as values and threaded-code interpretation
On 06/01/2013 11:43 AM, bearophile wrote: Alex Rønne Petersen: final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; Regarding the syntax, why do you use ? Isn't a single enough? case imm: lbl = handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile D (like C) uses a different namespace for labels and symbols that are not labels. This compiles today: void main(){ int foo; foo: auto b = foo; }
Re: Labels as values and threaded-code interpretation
On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote: I'd also like to see this. Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature.
Re: Labels as values and threaded-code interpretation
On Saturday, 1 June 2013 at 16:23:11 UTC, Adam D. Ruppe wrote: On Saturday, 1 June 2013 at 16:13:18 UTC, Timon Gehr wrote: I'd also like to see this. Me too. Last time this came up I said no since there's some inline asm hacks you can do but turns out those hacks suck in capability and appearance compared to just having this feature. I wonder if this sort of thing would help solve another implementation problem, which is implementing very fast coroutines without the overhead seen with similar features such as fibers. Some discussion here: http://forum.dlang.org/thread/pcexsqbwefjueyyly...@forum.dlang.org --rt
Re: Labels as values and threaded-code interpretation
On 01-06-2013 15:17, Diggory wrote: On Saturday, 1 June 2013 at 11:50:23 UTC, bearophile wrote: As example the very small but fast virtual machine written in GNU C++ from the International Contest on Functional Programming 2006: http://codepad.org/iibBeWKw It's faster than the same code without computed gotos. Bye, bearophile Would be cool if there was a platform independent way in D to construct a sequence of call instructions in memory. Then it would be possible to write a JIT compiler in the exact same style as that example. At that point, you just want a simple in-memory assembler. Think asmjit. It would be relatively easy to add to phobos although you'd have to be careful about DEP. Hmm, perhaps I will try writing something like this... Probably a bit too domain-specific (not to say it isn't useful). -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 01-06-2013 07:41, Manu wrote: GCC has a non-standard extension to do this, I've used it to get great performance wins writing emulators. I love this feature, love to see it in D! Yes, it's basically essential for high-perf interpreters. It's a feature that a systems language like D must have. On 1 Jun 2013 15:30, Alex Rønne Petersen a...@lycus.org mailto:a...@lycus.org wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/__gcc/Labels-as-Values.html http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/__installation_guide/INSTALL-__WIN32.html#Frequently-Asked-__Questions http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x = !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; } } insn.lbl = lbl; } // Start interpreting the entry instruction. auto state = new State; state.pc = code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance = state.pc = state.pc.next; ~ goto *state.pc.lbl;; handle_imm: { state.regs[state.pc.args[0]] = state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. -- Alex Rønne Petersen a...@alexrp.com mailto:a...@alexrp.com / a...@lycus.org mailto:a...@lycus.org http://alexrp.com / http://lycus.org -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 01-06-2013 11:43, bearophile wrote: Alex Rønne Petersen: final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; Regarding the syntax, why do you use ? Isn't a single enough? case imm: lbl = handle_imm; break; If such gotos become a natural part of the D syntax then it's not necessary to copy the GNU C syntax. Bye, bearophile I just used the GNU C syntax because I was familiar with it. I don't particularly care how it ends up looking in D. But Timon makes a good point about namespaces. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 01-06-2013 09:59, bearophile wrote: Alex Rønne Petersen: I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. This was discussed more than once, and I think it's a valuable improvement for certain kinds of low level D programming. Walter says that he has not added this feature to D because it's useful only to write interpreters and the like. I have found it useful in GCC to also write very fast finite state machines to analyse genomic strings. But even if Walter is right, writing interpreters (and emulators) is an important purpose for a system language as D. You can't write them efficient in scripting languages, and even Haskell/F# are not good at all to write them. Languages like C/D/C++ (and few others, as later probably Rust) are the only few natural languages to write them. Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. Even if DMD back-end will never implement it, I think it's important to define it formally and officially in the D language and add the relative syntax to the front-end (plus a standard version identifier that allows to write programs that have both a routine that uses computed gotos and one that doesn't use them). This has the big advantage that all future compilers that will want to implement it will use the same nice standard syntax. If D doesn't define this syntax, then maybe GDC will add it, and maybe later LDC will add it, and later maybe other future compilers will add it, and we can't be sure they will share the same syntax. Another advantage of having this syntax in D, is that even if a compiler back-end doesn't support computed gotos, the program compiles without syntax errors when you use conditional compilation to disable the piece of code that uses computed gotos. Yes, good points. Bye, bearophile -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
Timon Gehr: D (like C) uses a different namespace for labels and symbols that are not labels. This compiles today: void main(){ int foo; foo: auto b = foo; } On a related topic I wrote this: http://d.puremagic.com/issues/show_bug.cgi?id=4902 Bye, bearophile
Re: Labels as values and threaded-code interpretation
On 6/1/13, Timon Gehr timon.g...@gmx.ch wrote: D (like C) uses a different namespace for labels and symbols that are not labels. Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.: void main(){ int foo; foo: auto b = __traits(gotoAddr, foo); } And if it's successful, we add language support.
Re: Labels as values and threaded-code interpretation
On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere.
Re: Labels as values and threaded-code interpretation
On 02-06-2013 05:18, Andrej Mitrovic wrote: On 6/1/13, Timon Gehr timon.g...@gmx.ch wrote: D (like C) uses a different namespace for labels and symbols that are not labels. Perhaps for experimenting purposes (before resorting to language changes), a trait could be introduced. E.g.: void main(){ int foo; foo: auto b = __traits(gotoAddr, foo); } And if it's successful, we add language support. You need a way to jump to an arbitrary address too. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
On 02-06-2013 06:49, Walter Bright wrote: On 6/1/2013 7:35 PM, Alex Rønne Petersen wrote: On 01-06-2013 09:59, bearophile wrote: Recently the Python C interpreter was modified and speed up thanks to this non-standard feature. CPython source code has two versions, one with computed gotos and one without, to compile it even if your C compiler doesn't support them or their GNU-C syntax. I don't think there's any question as to the usefulness (and essentialness) of this feature. I'm very close to just writing most of the interpreter in C over a triviality like this. To be pedantic, C and C++ don't have that feature. Some compilers add it as an extension. I know, I just don't always remember to type out GNU C instead of C. Also, such a construct could not be made @safe. The trouble is you could pass those addresses anywhere, and goto them from anywhere. I don't particularly care about its safety (its insanely unsafe). It's all about performance with a feature like this. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Labels as values and threaded-code interpretation
Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x = !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; } } insn.lbl = lbl; } // Start interpreting the entry instruction. auto state = new State; state.pc = code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance = state.pc = state.pc.next; ~ goto *state.pc.lbl;; handle_imm: { state.regs[state.pc.args[0]] = state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org
Re: Labels as values and threaded-code interpretation
GCC has a non-standard extension to do this, I've used it to get great performance wins writing emulators. I love this feature, love to see it in D! On 1 Jun 2013 15:30, Alex Rønne Petersen a...@lycus.org wrote: Hi, I'm sure this has been brought up before, but I feel I need to bring it up again (because I'm going to be writing a threaded-code interpreter): http://gcc.gnu.org/onlinedocs/**gcc/Labels-as-Values.htmlhttp://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html This is an incredibly important extension. The final switch statement is not a replacement because it doesn't allow the programmer to store a label address directly into a code stream, which is what's essential to write a threaded-code interpreter. The Erlang folks went through hell just to use this feature; see the 5th Q at: http://www.erlang.org/doc/**installation_guide/INSTALL-** WIN32.html#Frequently-Asked-**Questionshttp://www.erlang.org/doc/installation_guide/INSTALL-WIN32.html#Frequently-Asked-Questions The idea is to be able to write code like this: import std.algorithm; enum Op : ubyte { imm, add, sub, // ... ret, } final class Insn { Op op; size_t[] args; void* lbl; Insn next; } final class State { Insn pc; size_t[64] regs; } size_t interp(Insn[] code) { // Set up the instruction stream with label addresses // the first time that it is executed. Label addresses // are stable, so we only do this once. foreach (insn; code.filter!(x = !x.lbl)()) { void* lbl; with (Op) { final switch (insn.op) { case imm: lbl = handle_imm; break; case add: lbl = handle_add; break; case sub: lbl = handle_sub; break; // ... case ret: lbl = handle_ret; break; } } insn.lbl = lbl; } // Start interpreting the entry instruction. auto state = new State; state.pc = code[0]; goto *state.pc.lbl; // Interpreter logic follows... // The whole point is to avoid unnecessary function // calls, so we use a mixin template for this stuff. enum advance = state.pc = state.pc.next; ~ goto *state.pc.lbl;; handle_imm: { state.regs[state.pc.args[0]] = state.pc.args[1]; mixin(advance); } handle_add: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] + state.regs[state.pc.args[2]]; mixin(advance); } handle_sub: { state.regs[state.pc.args[0]] = state.regs[state.pc.args[1]] - state.regs[state.pc.args[2]]; mixin(advance); } // ... handle_ret: { return state.regs[state.pc.args[0]]; } assert(false); } Notice that there are no function calls going on. Just plain jumps all the way through. This is a technique that many real world interpreters use to get significant speedups, and I for one think we desperately need it. Implementing it should be trivial as both LLVM and GCC support taking the address of a block. I'm sure the DMD back end could be extended to support it too. -- Alex Rønne Petersen a...@alexrp.com / a...@lycus.org http://alexrp.com / http://lycus.org