Re: inline mania
Dan Sugalski [EMAIL PROTECTED] wrote: Non-inline functions have their place in reducing code size and easing debugging. I just want an i_foo for every foo that callers will have the option of using. Before we make any promises to do all that extra work can we measure (for various architectures) the cost of a real call vs inline. I want proof that inline makes X% difference. I'm not going to prove that. A normal C function call involves several instructions and a jump most likely across page boundaries. But if that function is already in cache from another use, you win. Assume, for a minute, you've got a 700MHz system with a 100MHz 128-bit data bus. If your code is inlined and works out to 256 bytes, that's 16 fetches on the main bus. That costs 112 cycles. On the other hand, if your Dan, you are completely missing my point. Okay, fine, non-inline may be a performance win in some cases. Inline may be a win in others. I am not proposing we mandate inlining in any case, I am proposing the exact opposite: that we let the caller decide in every case. -- John Tobey, late nite hacker [EMAIL PROTECTED] \\\ /// ]]] With enough bugs, all eyes are shallow. [[[ /// \\\
RE: inline mania
Alan Burlison wrote: John Tobey wrote: 1 November 2000 - Perl6 alpha in C++ that uses classes derived from PerlInterpreter and SV everywhere in place of these types, a la Pickle, but with inline methods that use Perl 5's internal API. I think there is an undiscussed assumption about the implementation language in there somewhere... I think you may have missed the context of the message. John was talking about creating his Alpha using various existing projects that had already been done in C++. [ ... snip ... ] We've been down that path already - Topaz. With all due respect this is supposed to be a community rewrite. Your proposal doesn't seem to be along those lines. With all due respect, I think you may be taking this out of context. I don't believe John's intent was to hijack the process. He was outlining a theoretical schedule that could be used to provide a working Perl5 - Perl6 migration path. Regards, -Brent
Nick's performance theory - was inline mania
John Tobey [EMAIL PROTECTED] writes: Maybe not for void functions with no args, tail-called and with no prefix, but in more typical cases yes it can be different: the "function-ness" of i_foo applies constraints on where args and "result" are which the optimizer _may_ not be able to unravel. May not be able because of what the Standard says, or because of suboptimal optimization? Suboptimal optimization (i.e. lack of knowledge about the rest of the program at time of expansion) - note that a suitably "optimal" optimizer _could_ turn 100,000 #define-d lines back into "local real functions". But it is usually much easier to add entropy - so start with it being the same function - call it, and let the compiler decide which ones to expand. GCC won't unless you go -O3 or above. This is why many people (me included) stop at -O2 for most programs. Me too - because I _fundamentally_ believe inlining is nearly always sub-optimal for real programs. But -O3 (or -finline-functions) is there for the folk that want to believe the opposite. And there is -Dinline -D__inline__ for the inline case. What there isn't though is -fhash_define-as-inline or -fno-macros so at the very least let's avoid _that_ path. Non-inline functions have their place in reducing code size and easing debugging. I just want an i_foo for every foo that callers will have the option of using. Before we make any promises to do all that extra work can we measure (for various architectures) the cost of a real call vs inline. I want proof that inline makes X% difference. I'm not going to prove that. A normal C function call involves several instructions and a jump most likely across page boundaries. I have said this before but the gist of the Nick-theory is: Page boundaries are a don't care unless there is a page miss. Page misses are so costly that everything else can be ignored, but for sane programs they should only be incurred at "startup". (Reducing code size, e.g. no inline, only helps here - fewer pages to load.) It is cache that matters.
Modern processors (can) execute several instructions per cycle. In contrast a cache miss to 100MHz SDRAM costs a 500MHz processor more than 5 cycles (say up to 10 instructions for 2-way super-scalar) per word missed. I used to think that this was a "RISC Processor only" argument. But it seems (no hard numbers yet) that Pentium at least follows the same pattern. If someone else wants to prove this, great. I just don't think it's that much trouble. (mostly psychological - what will people think if they see that all our code is in headers and all our C files are autogenerated?) We can unlink the .c files once we have compiled them ;-) -- Nick Ing-Simmons
Re: inline mania
Brent Fulgham wrote: I think there is an undiscussed assumption about the implementation language in there somewhere... I think you may have missed the context of the message. John was talking about creating his Alpha using various existing projects that had already been done in C++. Why is he bothering? A year to produce a prototype doesn't seem like a useful way to expend effort on something that isn't actually perl6. We've been down that path already - Topaz. With all due respect this is supposed to be a community rewrite. Your proposal doesn't seem to be along those lines. With all due respect, I think you may be taking this out of context. I don't believe John's intent was to hijack the process. He was outlining a theoretical schedule that could be used to provide a working Perl5 - Perl6 migration path. I'm not saying it was. However I don't see how the proposal would aid the migration - after all what he is writing will be neither perl5 nor perl6. Alan Burlison
Re: Nick's performance theory - was inline mania
Nick Ing-Simmons [EMAIL PROTECTED] wrote: But it is usually much easier to add entropy - so start with it being the same function - call it, and let the compiler decide which ones to expand. You'll get no argument on that point. Please stop suggesting that I want to take the power of decision away from programmers *OR* compilers. If someone else wants to prove this, great. I just don't think it's that much trouble. (mostly psychological - what will people think if they see that all our code is in headers and all our C files are autogenerated?) We can unlink the .c files once we have compiled them ;-) Nope. Messes up source debuggers.
Re: inline mania
At 05:55 PM 8/1/00 -0400, John Tobey wrote: Dan Sugalski [EMAIL PROTECTED] wrote: Non-inline functions have their place in reducing code size and easing debugging. I just want an i_foo for every foo that callers will have the option of using. Before we make any promises to do all that extra work can we measure (for various architectures) the cost of a real call vs inline. I want proof that inline makes X% difference. I'm not going to prove that. A normal C function call involves several instructions and a jump most likely across page boundaries. But if that function is already in cache from another use, you win. Assume, for a minute, you've got a 700MHz system with a 100MHz 128-bit data bus. If your code is inlined and works out to 256 bytes, that's 16 fetches on the main bus. That costs 112 cycles. On the other hand, if your Dan, you are completely missing my point. Okay, fine, non-inline may be a performance win in some cases. Inline may be a win in others. I am not proposing we mandate inlining in any case, I am proposing the exact opposite: that we let the caller decide in every case. Having thought about it a bunch more (because of this) I'm proposing we let the compiler decide. The caller doesn't know enough to make that decision. *Especially* if (when) we allow for on-the-fly changing of variable access subsystems. We'll be marking a *lot* of functions as inline, and they'll likely get used often and frequently. (If all the SV, HV, and AV macros become functions, which they should, it'll be used a lot) If we go the PI route, the generated files should either have inlined functions or calls to plain functions, and the PI code generator can work that out. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
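The bus arithmetic Dan quotes (256 bytes over a 128-bit bus, 700MHz core, 100MHz bus) can be reproduced with a small sketch. The function names are hypothetical, and the model assumes every transfer stalls the CPU for a full bus period with no overlap or prefetch, which is only a rough approximation:

```c
#include <stdint.h>

/* Hypothetical helper: number of bus transfers needed to move `bytes`
 * bytes over a bus that carries `bus_bits` bits per transfer (rounded up). */
uint32_t bus_fetches(uint32_t bytes, uint32_t bus_bits)
{
    uint32_t bytes_per_fetch = bus_bits / 8;
    return (bytes + bytes_per_fetch - 1) / bytes_per_fetch;
}

/* CPU cycles spent waiting, assuming (as a simplification) that each
 * transfer costs cpu_mhz / bus_mhz CPU cycles and nothing overlaps. */
uint32_t fetch_cycles(uint32_t bytes, uint32_t bus_bits,
                      uint32_t cpu_mhz, uint32_t bus_mhz)
{
    return bus_fetches(bytes, bus_bits) * (cpu_mhz / bus_mhz);
}
```

With the figures from the message, bus_fetches(256, 128) gives the 16 fetches and fetch_cycles(256, 128, 700, 100) gives the 112 cycles Dan mentions.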
Re: inline mania
Dan Sugalski [EMAIL PROTECTED] wrote: At 05:55 PM 8/1/00 -0400, John Tobey wrote: Dan, you are completely missing my point. Okay, fine, non-inline may be a performance win in some cases. Inline may be a win in others. I am not proposing we mandate inlining in any case, I am proposing the exact opposite: that we let the caller decide in every case. Having thought about it a bunch more (because of this) I'm proposing we let the compiler decide. The caller doesn't know enough to make that decision. Read carefully. I said we *let* the caller decide, not *make* the caller decide. What, specifically, disturbs you about my proposal? -- John Tobey, late nite hacker [EMAIL PROTECTED] \\\ /// ]]] With enough bugs, all eyes are shallow. [[[ /// \\\
Re: inline mania
Dan Sugalski [EMAIL PROTECTED] wrote: At 05:34 PM 8/1/00 -0400, John Tobey wrote: Okay. For starters, assume that every inline function is called in exactly one place in the translation unit that defines its non-inline counterpart. That one place being, of course, i_foo's foo. This is a natural result of a clean, PI-like-generated source. Bad assumption. How often is av_fill called? By "assume" I mean "ensure" here. As in, this is how we build our library. -- John Tobey, late nite hacker [EMAIL PROTECTED] \\\ /// ]]] With enough bugs, all eyes are shallow. [[[ /// \\\
Re: inline mania
Dan Sugalski [EMAIL PROTECTED] wrote: Bad assumption. How often is av_fill called? Only once in av_fill.c (generated by allfuncs.pl). In most other places, it's called as i_av_fill(). -- John Tobey, late nite hacker [EMAIL PROTECTED] \\\ /// ]]] With enough bugs, all eyes are shallow. [[[ /// \\\
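A minimal sketch of the i_foo/foo layout John describes, assuming a toy array type rather than Perl's real AV and a much simplified fill operation. The names toy_av and the body of i_av_fill are illustrative only; only the split between an inline i_av_fill and a single out-of-line av_fill comes from the thread:

```c
#include <stddef.h>

/* Toy stand-in for Perl's AV; not the real structure. */
typedef struct {
    long     *items;  /* storage for max + 1 elements */
    ptrdiff_t fill;   /* highest index in use, -1 when empty */
    ptrdiff_t max;    /* highest index that fits in items */
} toy_av;

/* "Header" part: the inline version that callers may choose to use. */
static inline int i_av_fill(toy_av *av, ptrdiff_t fill)
{
    if (fill < -1 || fill > av->max)
        return 0;                       /* toy version never reallocates */
    for (ptrdiff_t i = av->fill + 1; i <= fill; i++)
        av->items[i] = 0;               /* clear newly exposed slots */
    av->fill = fill;
    return 1;
}

/* "av_fill.c" part: the one place where the inline body is expanded
 * into a real, callable function. */
int av_fill(toy_av *av, ptrdiff_t fill)
{
    return i_av_fill(av, fill);
}
```

Callers that want the call overhead gone use i_av_fill(); everyone else links against av_fill(), and the compiler remains free to ignore the inline hint.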
Re: External API's: XS, Pickle, Win32::API, FFI, C::DynaLib etc.
John Porter [EMAIL PROTECTED] writes: On Tue, Aug 01, 2000 at 12:01:52AM -0400, John Tobey wrote: That's a different problem. Configure is trying to reverse engineer header files. Garrett already knows the prototype of his DLL function he wants to call, but, unlike Configure, he doesn't have access to a C compiler. Hate to say it... well, no, I don't really. Wouldn't Tcl be the best example to follow here? Explain what Tcl did (and which version of Tcl). I am aware of how Jan added TIFF support to Tcl/Tk. That approach is:

Declare a huge struct:

    static struct TiffFunctions {
        VOID *handle;
        void (* Close) _ANSI_ARGS_((TIFF *));
        int (* GetField) _ANSI_ARGS_(TCL_VARARGS(TIFF *, tif));
        int (* GetFieldDefaulted) _ANSI_ARGS_(TCL_VARARGS(TIFF *,tif));
        TIFF* (* Open) _ANSI_ARGS_((CONST char*, CONST char*));
        ...
    } tiff = {0};

Declare a list of names:

    static char *symbols[] = {
        "TIFFClose",
        "TIFFGetField",
        "TIFFGetFieldDefaulted",
        "TIFFOpen",
        ...
        (char *) NULL
    };

Load and look up all the symbols one by one:

    if (LoadLib(interp, TIFF_LIB_NAME, tiff.handle, symbols, 10) != TCL_OK) {
        return TCL_ERROR;
    }

Call via the struct:

    tif = tiff.Open(tempFileName, "r");

This is horrible - the names are replicated all over the place. It still requires a compiler and seems to have all the worst features of all the schemes ... -- Nick Ing-Simmons
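One way the name replication Nick objects to could be reduced is to generate both the struct and the symbol lookups from a single list (an "X-macro"). This is only a sketch of that idea, not what Tcl does: demo_add, demo_negate, and lookup_symbol are hypothetical stand-ins for library functions and for a dlsym()-style resolver, and the scheme still needs a C compiler, so it does not address Nick's other complaint.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* Stand-ins for functions that would live in the shared library. */
static int demo_add(int a, int b) { return a + b; }
static int demo_negate(int a)     { return -a; }

/* Stand-in for a dlsym()-style resolver: map a name to an address. */
static generic_fn lookup_symbol(const char *name)
{
    if (strcmp(name, "demo_add") == 0)    return (generic_fn)demo_add;
    if (strcmp(name, "demo_negate") == 0) return (generic_fn)demo_negate;
    return NULL;
}

/* The single list: field name, symbol name, return type, argument list. */
#define DEMO_FUNCS(X)                        \
    X(Add,    "demo_add",    int, (int, int)) \
    X(Negate, "demo_negate", int, (int))

/* Struct of function pointers generated from the list. */
struct demo_functions {
#define X(field, sym, ret, args) ret (*field) args;
    DEMO_FUNCS(X)
#undef X
};

/* Fill every slot; returns 0 on success, -1 if any symbol is missing. */
int load_demo_functions(struct demo_functions *f)
{
#define X(field, sym, ret, args)                  \
    f->field = (ret (*) args) lookup_symbol(sym); \
    if (f->field == NULL) return -1;
    DEMO_FUNCS(X)
#undef X
    return 0;
}
```

Adding a function then means adding one line to DEMO_FUNCS; the struct field and the lookup follow from it.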
Re: inline mania
[EMAIL PROTECTED] wrote: John Tobey wrote: Why is he bothering? A year to produce a prototype doesn't seem like a useful way to expend effort on something that isn't actually perl6. It is actually perl6 if/when it's finished. Right, so it isn't a community effort then, as you intend doing it all yourself. I always welcome patches, suggestions, and constructive criticism. And if somebody else wants the "maintainer" role, I'll be happy to let them do that work for me. :-) I'm not saying it was. However I don't see how the proposal would aid the migration - after all what he is writing will be neither perl5 nor perl6. I am not "writing". I am "transforming". Ok, so neither is it a rewrite. I suspect it won't be called perl6 in any case. Well, if not, I'll have the satisfaction of having tried. Cheers -John
Re: inline mania
On Tue, 1 Aug 2000, John Tobey wrote: The people here are rightly skeptical about the effectiveness of using the 5.6 code base as a starting point for v6, but I have a pretty clear vision of how to do it, and I am committed to giving it a try, even if no one else will. In fact, I'll give you all a tentative schedule: Wait, you're going to develop Perl 6 ALONE? Wasn't this going to be "the community's rewrite of Perl"? Shouldn't you be trying to rally support for your vision before issuing schedules? I'm not trying to knock you - I'm not at all against hearing your plans and possibly helping out. This just seems like a pretty strange way to approach a community effort. 15 August 2000 - detailed draft spec to perl6-internals. 31 August 2000 - revised spec after discussion. What? You're expecting all the various perl6-* lists to come up with final RFCs by the end of the month? And you're expecting to have Larry's final plans by then? Or are you going to implement Perl 6 without knowing what it is? Unicode and threading would become integrable only after a lot of morphing (refactoring). The morphing would probably destroy any traces of v5 unicode support (since well under 20% of test scripts will notice), and of course 5005threads will be the first to go. With any luck, a compatible, well-integrated replacement will eventually take its place. This sounds hopeful, but mostly unfounded. Without starting with threading and Unicode as primary features you're going to be fighting an uphill battle a la Perl 5. -sam
Re: RFC: On-the-fly tainting via $^T
At 11:57 PM 7/31/00 -0700, Matthew Cline wrote: Something else which might be useful for tainting would be something like: taint_var($foo); no_taint_var($bar); With this, any value assigned to $foo would become tainted, and any value assigned to $bar would become untainted. While this is certainly doable (heck, you can do it now with tied variables), I'm not at all comfortable with magic untainting variables. I think it's a rather bad idea. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: RFC: On-the-fly tainting via $^T
I respectfully request that one list be picked for this topic and discussion confined to that one list even if it should occasionally spill into the other bailiwick. Or perhaps it's a candidate for a new working group. If all messages are CC:ed to all lists, then simply have p5p reborn (and the hassle of filtering duplicates). Nat
Re: $^O and $^T variables (on-the-fly tainting)
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS $Config{osname}, I think. I'm not thrilled with that, mainly because it DS means loading up Config, which ain't cheap. Why not have $Config hard coded into the executable? Perl has to have it or know it. So why not make it part of the executable. Then moving an executable around would carry the correct luggage. However, this got me thinking. Here is an idea I'd like to see: The existence of a $^T variable for controlling tainting in the same way that $^W controls warnings. Now *that* would be cool. I realize the current implementation of tainting requires it starts with the interpreter, but hey we're rewriting the internals, right? DS So put in an RFC. :) Seriously, finer grain control over tainting's not DS unreasonable, and I can think of a few ways to do it, but they need to be DS designed in *now*, not later. Just remember Larry's dislike of making untainting easy. I'd rather not have multiple characters. An option hash or even a longer namespace would be more readable. $Perl::Warnings{undef} = 1; $Perl::Tainting = 1; chaim -- Chaim Frenkel Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Summary...tell me where I'm worng...
Just trying to catch up. This is where I understand the discussion stands: INTERNALS(?) modular language: Scanner/Symbol Table/Parser/Executor Standard Functions separate from core (moving to language?) Modules Separate from everything (definitely language) Strict(er) DataTypes: Automatic Type Conversion(?) (internal or language?) Native Size Allocation (Internal or language?) Items still under general discussion: Formats (probably language if it stays) Garbage Collection (internals?) RegEx (internals?) localtime() (arrays start at 0 or 1) (language) Backward compatibility in general (who knows) If someone could just tell me where these discussions go (as many aren't really defined yet) I would be grateful. Also, If I have missed anything let me know, Grant M. [EMAIL PROTECTED]
Re: RFC: On-the-fly tainting via $^T
Please explain how having a no taint block would still keep the spirit of not making untainting easy? Just add a no taint at the top of one's code, and the -T goes away. chaim "DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS I think I'd prefer to leave untainting to regexes. DS What I was thinking of was something along the lines of a lexically scoped DS pragma--"use taint"/"no taint". (We could do this by sticking in an opcode DS to set/unset the tainting status, as well as the warning status, and so on) DS Taint checking is disabled in a no taint block. Whether we still set the DS taint status on a scalar could depend on the -T switch, so data would still DS be tainted in a no taint block. -- Chaim Frenkel Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
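The distinction Dan draws (an opcode switches taint *checking* on and off, while the taint *flag* on the data stays set) and Chaim's objection to it can both be seen in a small sketch. All types and function names here are hypothetical; nothing in it is taken from the Perl 5 sources:

```c
/* Toy scalar: a string plus a taint flag. */
struct scalar {
    const char *str;
    int tainted;
};

/* Toy interpreter state: whether taint checks are currently enforced. */
struct interp {
    int taint_checking;
};

/* What a "use taint" / "no taint" opcode might do at scope entry. */
void op_set_taint_checking(struct interp *in, int on)
{
    in->taint_checking = on;
}

/* Assignment copies the taint flag regardless of the checking state. */
void scalar_copy(struct scalar *dst, const struct scalar *src)
{
    *dst = *src;
}

/* Gate for an unsafe operation (say, passing the string to a shell):
 * refused only when checking is on AND the argument is tainted. */
int unsafe_op_allowed(const struct interp *in, const struct scalar *arg)
{
    return !(in->taint_checking && arg->tainted);
}
```

Inside a "no taint" region the unsafe operation goes through even on tainted data, which is exactly the loophole Chaim is asking about; once checking is switched back on, the data is still flagged and is refused again.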
Re: RFC: On-the-fly tainting via $^T
On Mon, Jul 31, 2000 at 10:42:54PM -0700, Nathan Wiger wrote: Dan Sugalski wrote: existence of a $^T variable for controlling tainting in the same way that $^W controls warnings. So put in an RFC. :) Dan- Ask and ye shall receive...in POD format ala Tim... I think this is more perl6-language than perl6-internals. Tim.
RE: Summary...tell me where I'm worng...
Perl's regex syntax is not poorly documented (IMHO, of course). Some of the new stuff is. (Poorly documented, that is.) MRE made me feel like a ghod (after I read chapter 7 for the third time ;) Some of the new stuff's not in MRE, which is, I suppose, partly why Jeffrey Friedl's working on a new edition (and trawling up a rich haul of bugs in the process -- see the p5p list). Of course, documentation in MRE II would lay that particular complaint of mine to rest (six feet deep in a lead-lined casket). -- Dominic Dunlop
Re: type-checking [Was: What is Perl?]
At 09:25 PM 8/1/00 +, Nick Ing-Simmons wrote: Alan Burlison [EMAIL PROTECTED] writes: No, I disagree. Perl gains a lot of its expressive power from being lax about typing. I suspect it will also impose an unacceptable overhead for the vast majority who don't want it - at the very least every variable access will have to check an 'are you typed' flag. Cross posted to internals ('cos it is...) We should consider using "vtables" to avoid the cost of the conditional branches (and running out of flag bits). Works for me. Anyone care to flesh this out a bit? (Take pity on a guy who's not had C++ inflicted upon him... :) Thus this function would call the variable's "type check" "method" - which for the normal case would be a pointer to a blue-white-hot "NoOp" function which is nearly always in-cache; for a typed var it could be as slow as you wanted... I was thinking that, since the compiler has most of the information, a "type check" opcode could be used, and inserted only where needed. If, for example, you had: my ($foo, $bar); my ($here, $there) : Place; $foo = $bar; $here = $there; You'd only need to typecheck the assignment to $here. Granted assignments through references would need to check unconditionally, and a typecheck's in order if you're not sure of the types (say, by directly referencing @_ elements), but the optimizer could certainly toss a lot of checks. Strong typing could also get us a win other places. If, for example, we said: my @foo : integer : strict; meant that @foo *only* has integers in it, we don't need to store full SVs in it, and can use an integer array only. If strict's off I don't see any reason to forbid bad assignments of 'known' types--if someone, using the above example, did a "$foo[2] = 'bar'" I don't see any reason not to make $foo[2] have a value of 0. (With a warning emitted by -w, of course) Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
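Dan's non-strict integer array (plain integers instead of full SVs, with "$foo[2] = 'bar'" storing 0 and warning under -w) might look roughly like the following. This is a sketch under loud assumptions: the function name is made up, numeric conversion is delegated to strtol rather than Perl's own rules, and overflow is ignored:

```c
#include <stdlib.h>

/* Store the numeric value of `text` into slot `idx` of a packed integer
 * array. Text with no leading number stores 0. Returns 1 when a -w style
 * warning would be due (no digits, or trailing junk), 0 otherwise. */
int int_array_store_str(long *slots, size_t idx, const char *text)
{
    char *end;
    long value = strtol(text, &end, 10);  /* yields 0 if nothing converts */

    slots[idx] = value;
    return end == text || *end != '\0';
}
```

The caller decides whether to print the warning, mirroring the way -w is a separate switch from the assignment itself.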
Re: Summary...tell me where I'm worng...
On Tue, Aug 01, 2000 at 07:03:42AM -0400, Grant M. wrote: Just trying to catch up. This is where I understand the discussion stands: INTERNALS(?) modular language: Scanner/Symbol Table/Parser/Executor Internals. Standard Functions separate from core (moving to language?) Some of each. Modules Separate from everything (definitely language) Yes. Strict(er) DataTypes: Automatic Type Conversion(?) (internal or language?) Native Size Allocation (Internal or language?) Language for now. Items still under general discussion: Formats (probably language if it stays) Garbage Collection (internals?) RegEx (internals?) Yes, Yes, Yes. localtime() (arrays start at 0 or 1) (language) Yes. Backward compatibility in general (who knows) Script Backward compatibility = language. XS Backward compatibility = here (later) if someone volunteers to write the code to make old XS code work with the new APIs. If someone could just tell me where these discussions go (as many aren't really defined yet) I would be grateful. Also, I'd say "if in doubt then it's not for perl6-internals, at least not for now". I'd also say there's not much point at the moment in discussing details of implementing features that we're not pretty sure will be in the language. I think there's _lots_ of valuable work we can do here prior to the language being firmed up. If we start getting into details of other things we won't make progress on the basics, like vtable interfaces for SV and libraries, analysis of GC implementations etc. We need to be pretty sure of most of those kinds of issues by the time the language gets firmed up. Tim.
Re: RFC: On-the-fly tainting via $^T
Simon Cozens [EMAIL PROTECTED] wrote: On Tue, Aug 01, 2000 at 01:43:05PM +0100, Graham Barr wrote: Let me just say that Larry has said in the past that untainting was deliberately left difficult to do, on the basis that something which can have a serious effect (i.e. security) should not be easy to do. But then I suppose all previous decisions are up for re-deciding Yes, they are. If we're going to make it trivially easy to untaint, should we bother having tainting at all? :( Tainting has potential uses as a data-tracking mechanism aside from security. If the keyword 'untaint' had to appear, it would be easier to find security issues than when m/(.*)/ is used. Uh-oh, now we're getting back into perl6-language territory... attempting to CC. -- John Tobey, late nite hacker [EMAIL PROTECTED] \\\ /// ]]] With enough bugs, all eyes are shallow. [[[ /// \\\
Re: type-checking [Was: What is Perl?]
Alan Burlison [EMAIL PROTECTED] writes: No, I disagree. Perl gains a lot of its expressive power from being lax about typing. I suspect it will also impose an unacceptable overhead for the vast majority who don't want it - at the very least every variable access will have to check an 'are you typed' flag. Cross posted to internals ('cos it is...) We should consider using "vtables" to avoid the cost of the conditional branches (and running out of flag bits). Thus this function would call the variable's "type check" "method" - which for the normal case would be a pointer to a blue-white-hot "NoOp" function which is nearly always in-cache; for a typed var it could be as slow as you wanted... -- Nick Ing-Simmons
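For readers who, like Dan, have not had C++ inflicted on them, Nick's vtable idea can be sketched in plain C as a struct of function pointers attached to each variable. The types and names below are illustrative assumptions, not a proposed Perl 6 layout:

```c
/* Toy value: either an integer or "something else". */
struct value {
    int  is_integer;
    long ival;
};

/* Per-variable table of "methods"; only a type check in this sketch. */
struct var_vtable {
    int (*type_check)(const struct value *incoming);
};

struct variable {
    const struct var_vtable *vt;
    struct value val;
};

/* The hot "NoOp" check shared by every untyped variable. */
int check_noop(const struct value *incoming)
{
    (void)incoming;
    return 1;
}

/* A check for variables declared as integers; could be arbitrarily slow. */
int check_integer(const struct value *incoming)
{
    return incoming->is_integer;
}

const struct var_vtable untyped_vtable = { check_noop };
const struct var_vtable integer_vtable = { check_integer };

/* Assignment always makes the indirect call: no 'are you typed' flag
 * test, no flag bit. Returns 1 on success, 0 on a failed check. */
int assign(struct variable *dst, const struct value *src)
{
    if (!dst->vt->type_check(src))
        return 0;
    dst->val = *src;
    return 1;
}
```

The cost for untyped variables is one indirect call to a tiny function that should stay in cache, which is the trade Nick is proposing against a conditional branch on every access.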
Re: Summary...tell me where I'm worng...
At 15:19 +0100 2000-08-01, Tim Bunce wrote: RegEx (internals?) Yes, Yes, Yes. I could argue for regex being language too: it's a little language, and it's got very crufty of late. Yes, it's sexy cruft which can be justified because it allows one to do neat things which were previously difficult or impossible (or merely verbose). But it's also more or less poorly documented, more or less poorly understood, more or less well-used, and more or less poorly tested. (Indeed, some of it's still marked as experimental.) If the language group is going to give each of perl's reserved words (and much else besides) the third degree so as to decide whether it should stay in the core, be cast into outer darkness, or end up somewhere in between, then I'd say that the same should be done for the language supported by the regex engine. Once the language group has decided what are the required and optional features of the regex language, then the internals folks can start working out how to make it fly (or tell the language folks it won't). Pluggable regex engines would make supporting (say) core and optional regex language features easier. Want me to write this up as an RFC? -- Dominic Dunlop
Inner loop (was Re: type-checking [Was: What is Perl?])
May I offer an alternative. Why do an interpreter? I remember reading good things about Threaded Interpreters (e.g. Forth). So why not do a TIL? Compile it to machine calls/jumps. Should be much faster than the inner run loop. This would fit in with Dan and Nick's keep it in cache. So there could be several different runtime stages. On machines where perl knows how to assemble machine instructions, do it in raw executable code. On machines where perl doesn't know how to, write a small assembly-language stub that does the threaded code. On those machines where we can't even do that, interpret the threaded code in C. chaim "DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS I was thinking that, since the compiler has most of the information, a DS "type check" opcode could be used, and inserted only where needed. If, for DS example, you had: DS my ($foo, $bar); DS my ($here, $there) : Place; DS $foo = $bar; DS $here = $there; -- Chaim Frenkel Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
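The last of Chaim's three stages, interpreting threaded code in C, can be sketched as a loop that walks an array of function pointers with inline operands. This is a toy stack machine with made-up op names, no bounds checking, and no relation to Perl's actual op tree:

```c
struct vm;
typedef void (*op_fn)(struct vm *);

/* A slot in the thread is either an op address or an inline operand. */
typedef union cell {
    op_fn op;
    long  literal;
} cell;

struct vm {
    const cell *ip;        /* next slot in the thread */
    long        stack[16]; /* operand stack; this sketch never checks bounds */
    int         sp;        /* number of values on the stack */
    int         running;
};

/* Push the literal stored in the slot after the op. */
void op_push(struct vm *m) { m->stack[m->sp++] = (m->ip++)->literal; }

void op_add(struct vm *m)
{
    long b = m->stack[--m->sp];
    m->stack[m->sp - 1] += b;
}

void op_mul(struct vm *m)
{
    long b = m->stack[--m->sp];
    m->stack[m->sp - 1] *= b;
}

void op_halt(struct vm *m) { m->running = 0; }

/* The whole "inner loop": fetch the next address and call it.
 * Returns the value left on top of the stack. */
long run_thread(const cell *code)
{
    struct vm m = { code, {0}, 0, 1 };

    while (m.running) {
        op_fn op = (m.ip++)->op;
        op(&m);
    }
    return m.stack[m.sp - 1];
}
```

A thread for (2 + 3) * 4 would be the cells push, 2, push, 3, add, push, 4, mul, halt. The machine-code and assembler-stub stages Chaim mentions would replace the while loop with direct jumps between ops.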
Stuff in core (was Re: date interface, on language (was Re: perl6 requirements, on bootstrap))
On Tue, Aug 01, 2000 at 11:37:49PM -0400, Dan Sugalski wrote: Right. That was my point. (The original poster wanted to pull IO out of the core entirely) Ah. Barbarians-at-gates approach, then. On the other hand, there is a lot of rubbish that *can* go out of core; I'd like to see core being syntax-plus-essentials. System V IPC, for instance, isn't essential. I think "essential" could be easily defined as "stuff which is portable pretty much everywhere"; the socket stuff can go into a separate library, for instance. This probably wouldn't affect speed too drastically because these things could always be linked in statically a la Dynaloader. But this is now an internals issue, so the list football starts again. Don't you just love arbitrary distinctions? -- VMS must die!