RE: Windows compile problems
On Wed, 24 Oct 2001, Brent Dax wrote:
> Unfortunately, I can't figure out how to utilize it. Including windows.h
> causes a conflict with Parrot's definition of BOOL, including winbase.h
> gives me a ton of syntax errors, and putting the declaration

Including a win* header is not supported unless you have already included windows.h.

Regards
Mattia
RE: Revamping the build system
Brent Dax:
> What about little inline things?
>
> AUTO_OP sleep(i|ic) {
> #ifdef WIN32
>     Sleep($1*1000);
> #else
>     sleep($1);
> #endif
> }

As long as the file compiles on all platforms, I think it's logical to consider it platform independent :-)

Brent Dax:
> Would you demand that that be put in a separate file? (As a matter of fact it can't be--ops2c.pl isn't equipped for that sort of thing.) Where would you draw the line?

Place things that are decidedly platform specific in a separate directory.

# 3. create an initial SIMPLE makefile and a config.h for each supported platform/compiler combination

Brent Dax:
> Problem with that is, some platforms don't have make or have bad makes. Neither nmake nor pmake works well enough on Win32 (and dmake uses a different syntax).

Well, the current crop of makefiles for the Win32 platform isn't exactly simple - and if you try a build using the dmake/Borland C++ Builder 5 combination you will find that some files obviously are out of date.

What I am thinking of is the situation where you don't have a Perl binary and want to bootstrap the build process. Skip include file dependencies and just get the makefile to build and link an initial binary capable of executing a parrot binary for a platform independent make.

> VMS almost always uses mms or mmk (and even if they had a normal make, I dare you to write a Makefile that will run there and on other platforms). Most Macs don't even have a command line or compiler, let alone make. You won't find such things available on handhelds either.

That's why I would like to see a separate initial makefile for each platform/compiler combination.

> Personally, I think we should write a shell script (or equivalent) for each platform that simply invokes the compiler to build miniperl, and we can do whatever we need from there.

That would also be something likely to work, as long as the shell script is written for a shell shipped with that platform.
In this case I would like to see a separate shell script (or equivalent) for each compiler/platform combination.

> A safe config.h could look like:
>
> typedef long INTVAL;
> typedef double FLOATVAL;
> typedef long opcode_t;
>
> #undef HAS_HEADER_*
>
> etc. Something like that ought to work on any platform; if necessary we can use #ifdefs with OS symbols (#ifdef WIN32, etc.) to figure it out. All miniperl does is figure out whatever Configure figures out currently and builds everything. (Also, we may want to write it so it looks for a Perl 5 or Perl 6 that's already installed and hands things off to that if possible.)
>
> When you think about it, how much functionality do we need? Do we need much of anything OS-dependent besides simple IO, -X operators and system() to emulate make? Do we even need to be as smart as make? Is there really a problem with stupidly rebuilding everything, even if it isn't all necessary?

No, I agree.

# I know this isn't high-tech, but it works like a charm.
#
# 4. write all other build tools in Perl

> Great. How are we going to do this? We can't depend on having a working Perl around at the beginning of the build process.

A parrot binary is going to be platform independent, right? So what we want is tools as parrot binaries, and a miniparrot, possibly created using a native shell script, capable of executing them.

# 5. use uuids to identify packages, not name, this way my
# MY::TextModule and your MY::TextModule can be identified as
# two different packages, OR require that I do something like
# harlinn::no::MY::TextModule when I name my packages/modules.

> Huh? Oh, you're talking about namespace conflicts. I don't think there's much we can do about that, except the official list on the CPAN we already have.

If we would like to create something for other languages besides Perl6 I think some thought should be given to this.
# To test for the presence of a particular library and # associated include # files maintain a list of filenames # for each supported platform/compiler combination. Like: # # ACE: LIB=E:\src\Corba\ACE_wrappers\bin\ace.lib; # INCLUDE=E:\src\Corba\ACE_wrappers;E:\src\Corba\ACE_wrappers\TAO # TCL: LIB=C:\Tcl\lib\tcl83.lib INCLUDE=C:\Tcl\include # DEFINES=WIN32;WINNT=1 // a comment # DB2: LIB=C:\SQLLIB\lib\db2api.lib;C:\SQLLIB\lib\db2cli.lib # INCLUDE=C:\SQLLIB\include # # and so on ... # # or in other words: # platform independent package name: LIB=[optional fullpath to # library[;optional fullpath to next library]] # INCLUDE=[optional fullpath of directory[;optional # fullpath to next # directory]] # DEFINES=NAME1=VALUE1;NAME2=VALUE2 // Comments # # My point is that the format of this file should be kept # really simple and # used during the next stage of the build process # to generate the final build. If a package is missing from # this file, then # it's not included in the final build. Huh? I don't get what this is
Re: Revamping the build system
In perl.perl6.internals, you wrote:
> Brent Dax [EMAIL PROTECTED] writes:
>> What about little inline things?
>>
>> AUTO_OP sleep(i|ic) {
>> #ifdef WIN32
>>     Sleep($1*1000);
>> #else
>>     sleep($1);
>> #endif
>> }
>
> This reminds me. gcc is slowly switching over to writing code like that as:
>
>     if (WIN32) {
>         Sleep($1*1000);
>     } else {
>         sleep($1);
>     }
>
> or the equivalent thereof instead of using #ifdef. If you make sure that the values are defined to be 0 or 1 rather than just defined or not defined, it's possible to write code like that instead.

If I recall correctly, Plan9's C compiler doesn't do #ifdef at all! The perl5 source (#ifdef forest) was munged into the second form.

> It may not be possible to use this in cases where the not-taken branch may refer to functions that won't be prototyped on all platforms, depending on the compiler, but there are at least some places where this technique can be used, and it's worth watching out for.

Yes, a number of the #ifdef branches in perl5's pp_sys.c would have this problem (odd structs present on some systems but not others, for example). Also the VMS code with $ signs often gives other compilers heartburn.

> (In the case above, I'd probably instead define a sleep function on WIN32 that calls Sleep so that the platform differences are in a separate file, but there are other examples of things like this that are better suited to other techniques.)

Yes, that's what perl5 traditionally often tried to do. (See, for example, the various defines in unixish.h: fwrite1, Stat, Fstat, Fflush, Mkdir). Of course perl5 itself hasn't even always followed that plan . . . .

--
Andy Dougherty [EMAIL PROTECTED]
Dept. of Physics
Lafayette College, Easton PA 18042
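For anyone who wants to try the 0-or-1 style discussed in this message, here is a minimal sketch in C. It is an illustration only, not code from Parrot or gcc: PLATFORM_WIN32 and platform_sleep_units() are hypothetical names, and the macro is derived from the compiler-provided _WIN32 rather than from a real Configure step. The function deliberately returns the converted value instead of calling Sleep() or sleep(), which sidesteps the caveat raised above about the not-taken branch naming functions that are not prototyped on every platform.

```c
#include <assert.h>

/* Assumption: a configure step defines the platform symbol to exactly
 * 0 or 1. PLATFORM_WIN32 is a hypothetical name; here it is derived
 * from the compiler-provided _WIN32 macro. */
#ifdef _WIN32
#  define PLATFORM_WIN32 1
#else
#  define PLATFORM_WIN32 0
#endif

/* Convert a sleep time in seconds to the unit the platform call expects.
 * Both branches are parsed and type-checked on every platform; the
 * compiler is free to discard the branch whose condition is constant 0. */
long platform_sleep_units(long seconds)
{
    assert(seconds >= 0);
    if (PLATFORM_WIN32) {
        return seconds * 1000L;   /* Win32 Sleep() takes milliseconds */
    } else {
        return seconds;           /* POSIX sleep() takes seconds */
    }
}
```

Because the condition is an ordinary expression, a typo in either branch shows up on every platform, which is the main attraction over an #ifdef forest.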
RE: Revamping the build system
Espen Harlinn: # Brent Dax: # What about little inline things? # # AUTO_OP sleep(i|ic) { # #ifdef WIN32 # Sleep($1*1000); # #else # sleep($1); # #endif # } # # As long as the file compiles on all platforms, I think it's logical to # consider it platform independant :-) AUTO_OP sleep(i|ic) { #ifdef WIN32 SleepEx($1*1000, NULL); #endif #ifdef VMS #ifdef __VAX proc_sleep($1); #else proc_sleep2($1, NULL); #endif #endif #ifdef MACOS process_pause($1*100); #endif sleep($1); } Is that platform-independent? (No, I'm not saying that's what's needed to do sleep, just giving an example. But look through the Perl 5 source and you'll find things that make this look pretty.) # Brent Dax: # Would you demand that that be put in a separate file? (As # a matter of # fact it can't be--ops2c.pl isn't equipped for that sort of thing.) # Where would you draw the line? # Place things that are decidedly platform specific in a # separate directory Fair enough. # # 3. create an initial SIMPLE makefile and a config.h for # each supported platform/compiler combination # # Brent Dax: # Problem with that is, some platforms don't have make or have # bad makes. # Neither nmake nor pmake works well enough on Win32 (and dmake uses a # different syntax). # Well, the current crop of makefiles for the Win32 platform # isn't exactly # simple - and if you try a build using the dmake/Borland C++ Builder 5 # combination you will find that some files obviously are out of date. # # What I am thinking of is the situation where you don't have a # Perl binary # and want to bootstrap the build process. Skip include file # dependencies and # just get the makefile to build and link an initial binary capable of # executing a parrot binary for a platform independant make. But once again, we can't depend on make existing. That's why I'm suggesting shell scripts--virtually all platforms have something like them. Even Macs have AppleScript. 
# VMS almost always uses mms or mmk (and # even if they # had a normal make, I dare you to write a Makefile that will # run there # and on other platforms). Most Macs don't even have a # command line or # compiler, let alone make. You won't find such things available on # handhelds either. # Thats why I would like to see a sparate initial makefile for each # platform/compiler combination That seems like a lot of extra work. Do we really want to have separate 'install' (for lack of a better name) scripts where all we did was Cs/$compiler_name_1/$compiler_name_2/g? # Personally, I think we should write a shell script (or # equivalent) for # each platform that simply invokes the compiler to build # miniperl, and we # can do whatever we need from there. # # That would also be something likely to work, as long as the # shell script is # written for a shell shipped with that platform. # In this case would like to see a separate shell script (or # equivalent) for # each compiler/platform combination. That the script would be written for a shell on that platform is kinda assumed. Once again, I think that most compilers' calling semantics are similar enough that we will often just have to change the name of the command, so why add extra scripts to maintain? # A safe config.h could look like: # # typedef long INTVAL; # typedef double FLOATVAL; # typedef long opcode_t; # # #undef HAS_HEADER_* # # etc. Something like that ought to work on any platform; if # necessary we # can use #ifdefs with OS symbols (#ifdef WIN32, etc.) to # figure it out. # All miniperl does is figure out whatever Configure figures # out currently # and builds everything. (Also, we may want to write it so it # looks for a # Perl 5 or Perl 6 that's already installed and hands things # off to that # if possible.) # # When you think about it, how much functionality do we need? # Do we need # much of anything OS-dependent besides simple IO, -X operators and # system() to emulate make? 
Do we even need to be as smart # as make? Is # there really a problem with stupidly rebuilding everything, # even if it # isn't all necessary? # No, I agree Good, we agree on something. :^) # # I know this isn't hightech, but it works like a charm. # # # # 4. write all other build tools in Perl # # Great. How are we going to do this? We can't depend on having a # working Perl around at the beginning of the build process. # A parrot binary is going to be platform independant - right ?? # So what we want is tools as parrot binaries, and a # miniparrot, possible # created using a native shell script, capable of executing them If you mean bytecode, that's true I suppose. At the very beginning of the build, all we can depend on is $cc and shell (or equivalent) scripts. # # 5. use
Re: Windows compile problems
In perl.perl6.internals, you wrote:
> On Wed, 24 Oct 2001, Brent Dax wrote:
>> Unfortunately, I can't figure out how to utilize it. Including windows.h
>> causes a conflict with Parrot's definition of BOOL, including

Then we probably should change Parrot's name of BOOL. I'd suggest Bool_t, modeled after perl5's Size_t (and similar types).

Perl5 could actually use Bool_t, so if anyone implements such a test, back-porting it to perl5 would be appreciated.

--
Andy Dougherty [EMAIL PROTECTED]
Dept. of Physics
Lafayette College, Easton PA 18042
Chr Ord, v0.4
Hey all. This is version 0.4 of my chr and ord patch for parrot. Included is a patch, a test file, and an example. I don't really see any major problems with this version, at least that aren't implicit in the current Way Of Things with strings. (That is, native not being explicitly anything, and the encodings list being static.) Chr and Ord aren't implemented for utf8 and utf16, only for native and utf32. I'd much appreciate it if sombody who knew what they were doing did this. The tests are woefuly incomplete. The style of the example is poor. -=- James Mastros Index: core.ops === RCS file: /home/perlcvs/parrot/core.ops,v retrieving revision 1.18 diff -u -r1.18 core.ops --- core.ops2001/10/24 14:54:54 1.18 +++ core.ops2001/10/25 13:38:35 @@ -991,6 +991,43 @@ $1 = string_substr(interpreter, $2, $3, $4, $1); } + + +=item Bord(i, s) +=item Bord(i, sc) + +Set $1 to the codepoint of the first character in $2. + +=cut + +AUTO_OP ord(i, s|sc) { + $1 = string_ord($2); +} + + + +=item Bchr(s, i) +=item Bchr(s, ic) + +Set $1 to a single-character string with the Unicode codepoint $2. + +=cut + +AUTO_OP chr(s, i|ic) { +$1 = string_chr(interpreter, $2, enc_utf32, $1); +} + + + +=item Bchr(s, i|ic, i|ic) + +Set $1 to a single-character string with the codepoint $2 in the encoding $3. + +=cut + +AUTO_OP chr(s, i|ic, i|ic) { +$1 = string_chr(interpreter, $2, $3, $1); +} =back Index: string.c === RCS file: /home/perlcvs/parrot/string.c,v retrieving revision 1.15 diff -u -r1.15 string.c --- string.c2001/10/22 23:34:47 1.15 +++ string.c2001/10/25 13:38:35 @@ -168,6 +168,32 @@ return (ENC_VTABLE(s1)-compare)(s1, s2); } +/*=for api string string_ord + * get the codepoint of the first char of the string. + * (FIXME: Document in docs/strings.pod) + */ +INTVAL +string_ord(STRING* s) { + return (ENC_VTABLE(s)-ord)(s); +} + +/*=for api string string_chr + * Get a string with the first char having codepoint code, in the encoding + * enc, and store it in d. Also return d. 
+ * Allocate memory for d if necessary. + */ +STRING* +string_chr(struct Parrot_Interp *interpreter, INTVAL code, encoding_t enc, STRING** +d) { +STRING *dest; +if (!d || !*d) { +dest = string_make(interpreter, NULL, 0, enc, 0, 0); +} +else { +dest = *d; +} +return (ENC_VTABLE(dest)-chr)(code, dest); +} + /* * Local variables: * c-indentation-style: bsd @@ -176,9 +202,4 @@ * End: * * vim: expandtab shiftwidth=4: -*/ - - - - - + */ Index: strnative.c === RCS file: /home/perlcvs/parrot/strnative.c,v retrieving revision 1.19 diff -u -r1.19 strnative.c --- strnative.c 2001/10/22 23:34:47 1.19 +++ strnative.c 2001/10/25 13:38:35 @@ -105,6 +105,32 @@ return cmp; } +/*=for api string_native string_native_ord + returns the value of the first byte of the string. + */ +INTVAL +string_native_ord (STRING* s) { + return (INTVAL)*(char *)(s-bufstart); +} + +/*=for api string_native string_native_chr + return a string whose first character is given by the INTVAL. +*/ +STRING* +string_native_chr (INTVAL code, STRING* dest) { + if (dest-encoding-which != enc_native) { + /* It is now, matey. */ + dest-encoding = (Parrot_string_vtable[enc_native]); + } + + string_grow(dest, 1); + *(char *)dest-bufstart = (char)code; + dest-strlen = 1; + dest-bufused = 1; + + return dest; +} + /*=for api string_native string_native_vtable return the vtable for the native string */ @@ -118,6 +144,8 @@ string_native_chopn, string_native_substr, string_native_compare, +string_native_ord, + string_native_chr, }; return sv; } Index: strutf32.c === RCS file: /home/perlcvs/parrot/strutf32.c,v retrieving revision 1.4 diff -u -r1.4 strutf32.c --- strutf32.c 2001/10/22 23:34:47 1.4 +++ strutf32.c 2001/10/25 13:38:35 @@ -102,6 +102,32 @@ return cmp; } +/*=for api string_native string_utf32_ord + returns the value of the first byte of the string. 
+ */ +INTVAL +string_utf32_ord (STRING* s) { + return (INTVAL)*(utf32_t *)(s-bufstart); +} + +/*=for api string_utf32 string_utf32_chr + return a string whose first character is given by the INTVAL. +*/ +STRING* +string_utf32_chr (INTVAL code, STRING* dest) { + if (dest-encoding-which != enc_utf32) { + /* It is now, matey. */ + dest-encoding = (Parrot_string_vtable[enc_utf32]); + } + + string_grow(dest, 1); + *(utf32_t *)dest-bufstart = (utf32_t)code; + dest-strlen = 1; + dest-bufused = 4; + + return dest; +} + /*=for api
Re: Are threads what we really want ???
At 02:28 AM 10/25/2001 +0200, Espen Harlinn wrote:
> Instead of thinking about multiple threads, one could think about multiple execution contexts. Each instance of an object must belong to one and only one execution context. Each execution context has an attached security context and a security manager.

One actually needs to think about both. Threads and execution contexts aren't required to be related. You could have multiple threads in a single execution context (though it works badly with high-level languages as we found with perl 5's pthread model, but that's a separate issue) or multiple execution contexts with a single thread, which is what happens when you allow a process to create multiple interpreters.

Parrot will support the single-thread/multiple-interpreter and multiple-thread/multiple-interpreter models. (Where there's a 1:1 relationship between those multiple threads and multiple interpreters)

Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
Re: Windows compile problems
At 08:59 AM 10/25/2001 -0400, Andy Dougherty wrote:
> In perl.perl6.internals, you wrote:
>> On Wed, 24 Oct 2001, Brent Dax wrote:
>>> Unfortunately, I can't figure out how to utilize it. Including windows.h
>>> causes a conflict with Parrot's definition of BOOL, including
>
> Then we probably should change Parrot's name of BOOL. I'd suggest Bool_t, modeled after perl5's Size_t (and similar types).

Sounds like a good idea.

Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
HP-UX 11.00 still not happy
l1:/pro/3gl/CPAN/parrot-current 102 make distclean
perl -MExtUtils::Manifest=filecheck -le 'xtUtils::Manifest::Quiet=1;unlink for filecheck()'
Undefined subroutine xtUtils::Manifest::Quiet called at -e line 1.
make: *** [distclean] Error 255
l1:/pro/3gl/CPAN/parrot-current 103 rm -f *.o *.a
l1:/pro/3gl/CPAN/parrot-current 104 perl Co
Config_pm.in Configure.pl
l1:/pro/3gl/CPAN/parrot-current 104 perl Configure.pl --default
:
l1:/pro/3gl/CPAN/parrot-current 105 make test_prog
perl vtable_h.pl
:
cc -DDEBUGGING -Ae -D_HPUX_SOURCE -I/pro/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I./include -o stacks.o -c stacks.c
cc: stacks.c, line 105: warning 604: Pointers are not assignment-compatible.
:
cc -DDEBUGGING -Ae -D_HPUX_SOURCE -I/pro/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I./include -o vtable_ops.o -c vtable_ops.c
cc: vtable_ops.c, line 37: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 37: error 1533: Illegal function call.
cc: vtable_ops.c, line 43: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 43: error 1533: Illegal function call.
cc: vtable_ops.c, line 49: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 49: error 1533: Illegal function call.
cc: vtable_ops.c, line 55: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 55: error 1533: Illegal function call.
cc: vtable_ops.c, line 61: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 61: error 1533: Illegal function call.
cc: vtable_ops.c, line 67: error 1534: Illegal to use a function pointer as + operand where an arithmetic type is required.
cc: vtable_ops.c, line 67: error 1533: Illegal function call.
make: *** [vtable_ops.o] Error 1
l1:/pro/3gl/CPAN/parrot-current 106 cat .timestamp
1003950001
Wed Oct 24 19:00:01 2001 UTC
(time of this cvs update)
l1:/pro/3gl/CPAN/parrot-current 107

--
H.Merijn Brand    Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl-5.6.1, 5.7.2 629 on HP-UX 10.20 11.00, AIX 4.2, AIX 4.3, WinNT 4, Win2K pro WinCE 2.11.
Smoking perl CORE: [EMAIL PROTECTED]
http:[EMAIL PROTECTED]/ [EMAIL PROTECTED]
send smoke reports to: [EMAIL PROTECTED], QA: http://qa.perl.org
String rationale
'Kay, here's the string background info I promised. If things are missing or unclear let me know and I'll fix it up until it is.

==Cut here with a very sharp knife===
=head1 TITLE

A parrot string backgrounder

=head1 Overview

Strings, in parrot, are compartmentalized, the same way so much else in Parrot is compartmentalized. There's no single 'blessed' string encoding--the closest we come is Unicode, and only as an encoding of last resort. (Unicode's not a good interchange format, as it loses information)

=head2 From the Outside

On the outside, the interpreter considers strings to be a sort of black box. The only bits of the interpreter that much care about the string data are the regex engine parts, and those only operate on fixed-sized data.

The interpreter can only peek inside a string if that string is of fixed length, and the interpreter doesn't actually care about the character set the data is in. All character sets must provide a way to transcode to Unicode, and all character encodings must provide a way to turn their characters into fixed-sized entities. (The size may be 8, 16, or 32 bits as need be for the character set)

Character sets may provide a way to transcode to non-Unicode sets, for example from EBCDIC to ASCII, but this is optional. If none is provided a transcoding from one set to another will use Unicode as an intermediate form, complete with potential data loss.

All character sets must provide the character lists the regular expression engine needs for the base character classes. (space, word, and digit characters) This permits the regular expression code to operate on the contents of a string without needing to know its actual character set.
=head2 From the Inside

=head2 Technical details

The base string structure looks like:

  struct parrot_string {
    void *bufstart;
    INTVAL buflen;
    INTVAL bufused;
    INTVAL flags;
    INTVAL strlen;
    STRING_VTABLE* encoding;
    INTVAL type;
    INTVAL language;
  };

=head2 Fields

=over 4

=item bufstart

Where the string buffer starts

=item buflen

How big the buffer is

=item bufused

How much of the buffer's used

=item flags

A variety of flags. Low 16 bits reserved to Parrot, the rest are free for the string encoding library to use

=item strlen

How long the string is in code points. (Note that, for encodings that are more than 8 bits per code point, or of variable length, this will I<not> be the same as the buffer used.)

=item encoding

Pointer to the library that handles the string encoding. Encoding is basically how the stream of bytes pointed to by C<bufstart> can be turned into a stream of 32-bit codepoints. Examples include UTF-8, Big 5, or Shift JIS. Unicode, Ascii, or EBCDIC are B<not> encodings.first

=item type

What the character set or type of data is encoded in the buffer. This includes things like ASCII, EBCDIC, Unicode, Chinese Traditional, Chinese Simplified, or Shift-JIS. (And yes, I know the latter's a combination of type and encoding. I'll update the doc as soon as I can reasonably separate the two)

=item language

The language the string is in. This is essential for proper sorting, if a sort function wants to be language-aware. Just an encoding/type is insufficient for proper sorting--for example knowing a string is UTF-32/Unicode doesn't tell you how the data should be ordered. This is especially important for those languages that overlap in the Unicode code space. Japanese and Chinese, for example, share many of the Unicode code points but sort those code points differently.

=back

Libraries for processing character sets and encodings are shareable libraries, and may be loaded on demand. They are looked up and referenced by name.
An identifying number is given to them at load time and shouldn't be used outside the currently running process. (EBCDIC might be character set 3 in one run and set 7 in another)

The native encoding and character set is I<never> considered a 'real' encoding or character set. It just specifies what the default is if nothing else is specified, but when bytecode is frozen to disk the actual encoding or set name will be used instead.

Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
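To make the "looked up by name, numbered at load time" rule concrete, here is a small sketch in C of a per-process registry. It is only an illustration under stated assumptions: charset_id(), charset_name() and MAX_CHARSETS are hypothetical names, not Parrot API, and no shared library is actually loaded. The point it shows is that the number depends on registration order within one run, so only the name is safe to write into frozen bytecode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-process registry; not Parrot's real data structure. */
#define MAX_CHARSETS 16

static const char *charset_names[MAX_CHARSETS];
static int charset_count = 0;

/* Return the run-local id for a character set name, registering the name
 * on first use. Returns -1 if the table is full. The caller must keep the
 * name alive (string literals are fine). */
int charset_id(const char *name)
{
    int i;
    assert(name != NULL);
    for (i = 0; i < charset_count; i++) {
        if (strcmp(charset_names[i], name) == 0)
            return i;
    }
    if (charset_count == MAX_CHARSETS)
        return -1;
    charset_names[charset_count] = name;
    return charset_count++;
}

/* Map a run-local id back to the name; this is what would be stored when
 * bytecode is frozen to disk. Returns NULL for an unknown id. */
const char *charset_name(int id)
{
    if (id < 0 || id >= charset_count)
        return NULL;
    return charset_names[id];
}
```

A program that registers EBCDIC first gets one number for it and a program that registers it later gets another, which is the "set 3 in one run and set 7 in another" situation described above.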
Re: String rationale
On Thu, 25 Oct 2001, Dan Sugalski wrote:
> The only bits of the interpreter that much care about the string data are the regex engine parts, and those only operate on fixed-sized data.

Care to elaborate? I thought the mandate from Larry was to have regexes compile down to a stream of string ops. Doesn't that mean it should work regardless of the encoding of the string?

> The interpreter can only peek inside a string if that string is of fixed length, and the interpreter doesn't actually care about the character set the data is in.

Why is this necessary at all? Wouldn't it be preferable to have all access go through the String vtable regardless of the encoding?

> =item encoding
>
> Pointer to the library that handles the string encoding. Encoding is basically how the stream of bytes pointed to by C<bufstart> can be turned into a stream of 32-bit codepoints. Examples include UTF-8, Big 5, or Shift JIS. Unicode, Ascii, or EBCDIC are B<not> encodings.first

.first?

Aside from the above, this was a nice refresher.

-sam
Re: String rationale
At 12:19 PM 10/25/2001 -0400, Sam Tregar wrote:
> On Thu, 25 Oct 2001, Dan Sugalski wrote:
>> The only bits of the interpreter that much care about the string data are the regex engine parts, and those only operate on fixed-sized data.
>
> Care to elaborate? I thought the mandate from Larry was to have regexes compile down to a stream of string ops. Doesn't that mean it should work regardless of the encoding of the string?

Since the encoding just determines how the abstract code point numbers are represented in bytes, I'm OK with requiring strings we process internally to be in a fixed-size version.

And regexes will be done with a stream of parrot opcodes, presuming that's not too slow. There'll be ops to reference the code point at position X in a string and check to see if it's in a list of other code points and suchlike things. Basically we'll peek under the covers, but only for fixed-length strings.

>> The interpreter can only peek inside a string if that string is of fixed length, and the interpreter doesn't actually care about the character set the data is in.
>
> Why is this necessary at all? Wouldn't it be preferable to have all access go through the String vtable regardless of the encoding?

Speed. We're going to take something of a hit decomposing to ops as it is--if we can safely cheat, I'm OK with mandating it to be required. :)

>> =item encoding
>>
>> Pointer to the library that handles the string encoding. Encoding is basically how the stream of bytes pointed to by C<bufstart> can be turned into a stream of 32-bit codepoints. Examples include UTF-8, Big 5, or Shift JIS. Unicode, Ascii, or EBCDIC are B<not> encodings.first
>
> .first?

Trailing buffer gook.

Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
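As a rough picture of what "peeking under the covers" of a fixed-width string could look like, here is a sketch in C. It assumes nothing about the eventual Parrot ops: code_point_at() and code_point_in_list() are hypothetical helpers, the width is passed in explicitly rather than read from a string header, and the buffer is assumed to be suitably aligned for its element width. With a fixed width, position X is plain array indexing, which is where the speed argument above comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fetch the code point at a given position of a fixed-width buffer.
 * width is the size of one code point in bytes (1, 2 or 4), matching the
 * 8/16/32-bit fixed-size forms mentioned in the backgrounder. */
uint32_t code_point_at(const void *buf, size_t width, size_t index)
{
    switch (width) {
    case 1:
        return ((const uint8_t *)buf)[index];
    case 2:
        return ((const uint16_t *)buf)[index];
    default:
        assert(width == 4);
        return ((const uint32_t *)buf)[index];
    }
}

/* Check whether a code point appears in a list of code points, e.g. the
 * "space" list a character set supplies to the regex engine. */
int code_point_in_list(uint32_t cp, const uint32_t *list, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (list[i] == cp)
            return 1;
    }
    return 0;
}
```

A variable-width encoding such as UTF-8 cannot be indexed this way without scanning from the start, which is why the direct access is limited to the fixed-size forms.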
Re: Windows compile problems
Dan Sugalski [EMAIL PROTECTED] writes:
> At 08:59 AM 10/25/2001 -0400, Andy Dougherty wrote:
>> Then we probably should change Parrot's name of BOOL. I'd suggest Bool_t, modeled after perl5's Size_t (and similar types).
>
> Sounds like a good idea.

IIRC, all types ending in _t are reserved by POSIX and may be used without warning in later versions of the standard. (This comes up not infrequently in some of the groups I read, but I unfortunately don't have a copy of POSIX to check for myself and be sure.)

--
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/
Re: Windows compile problems
At 12:24 PM 10/25/2001 -0700, Russ Allbery wrote:
> Dan Sugalski [EMAIL PROTECTED] writes:
>> At 08:59 AM 10/25/2001 -0400, Andy Dougherty wrote:
>>> Then we probably should change Parrot's name of BOOL. I'd suggest Bool_t, modeled after perl5's Size_t (and similar types).
>>
>> Sounds like a good idea.
>
> IIRC, all types ending in _t are reserved by POSIX and may be used without warning in later versions of the standard. (This comes up not infrequently in some of the groups I read, but I unfortunately don't have a copy of POSIX to check for myself and be sure.)

Ah, good point. Maybe we should go with _p as a suffix rather than _t. (the p for parrot, of course)

Dan

--it's like this---
Dan Sugalski [EMAIL PROTECTED]
even samurai have teddy bears and even teddy bears get drunk
[PATCH] Exceptions as promised...
The included patch requires a new file t/op/exceptions.t, which tests basic exception handling, in this case divide-by-zero. Patch was generated against latest CVS, but it shouldn't matter -that- much. -Jeff [EMAIL PROTECTED] diff --recursive -C 2 parrot_cvs/MANIFEST parrot/MANIFEST *** parrot_cvs/MANIFEST Wed Oct 24 07:36:57 2001 --- parrot/MANIFEST Wed Oct 24 07:37:22 2001 *** *** 108,111 --- 108,112 t/op/basic.t t/op/bitwise.t + t/op/exception.t t/op/integer.t t/op/number.t Only in parrot/: Makefile diff --recursive -C 2 parrot_cvs/Parrot/Assembler.pm parrot/Parrot/Assembler.pm *** parrot_cvs/Parrot/Assembler.pm Wed Oct 24 07:36:57 2001 --- parrot/Parrot/Assembler.pm Wed Oct 24 07:54:22 2001 *** *** 110,114 =cut ! my(%type_to_suffix)=('I'='i', 'N'='n', 'S'='s', 'P'='p', 'i'='ic', 'n'='nc', --- 110,115 =cut ! my(%type_to_suffix)=('E'='e', !'I'='i', 'N'='n', 'S'='s', 'P'='p', 'i'='ic', 'n'='nc', *** *** 923,927 # ! if (m/^([INPS])\d+$/) { # a register. push @arg_t,lc($1); } elsif (m/^\[([a-z]+):(\d+)\s*\]$/) { # string constant --- 924,928 # ! if (m/^([EINPS])\d+$/) { # a register. push @arg_t,lc($1); } elsif (m/^\[([a-z]+):(\d+)\s*\]$/) { # string constant *** *** 945,949 # ! my @grep_ops = grep($_ =~ /^$opcode(?:_(?:(?:[ins]c?)|p))+$/, keys(%opcodes)); foreach my $op (@grep_ops) { --- 946,950 # ! my @grep_ops = grep($_ =~ /^$opcode(?:_(?:(?:[eins]c?)|p))+$/, keys(%opcodes)); foreach my $op (@grep_ops) { *** *** 1056,1059 --- 1057,1061 my %rtype_map = ( + e = E, i = I, n = N, *** *** 1092,1100 # ! if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) { # its a register argument ! $args[$_] =~ s/^[INPS](\d+)$/$1/i ! or error(Expected m/[INPS]\\d+/, but got '$args[$_]'!, $file, $line); error(Register $1 out of range (should be 0-31) in '$opcode',$file,$line) if $1 0 or $1 31; --- 1094,1102 # ! if($rtype eq E || $rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) { # its a register argument ! $args[$_] =~ s/^[EINPS](\d+)$/$1/i ! 
or error(Expected m/[EINPS]\\d+/, but got '$args[$_]'!, $file, $line); error(Register $1 out of range (should be 0-31) in '$opcode',$file,$line) if $1 0 or $1 31; Only in parrot/Parrot: Config.pm Only in parrot/Parrot: Types.pm diff --recursive -C 2 parrot_cvs/Types_pm.in parrot/Types_pm.in *** parrot_cvs/Types_pm.in Wed Oct 24 07:36:57 2001 --- parrot/Types_pm.in Wed Oct 24 07:39:58 2001 *** *** 35,38 --- 35,39 my %how_to_pack = ( + E = $pack_type{op}, I = $pack_type{op}, i = $pack_type{op}, Only in parrot/classes: intclass.o diff --recursive -C 2 parrot_cvs/config_h.in parrot/config_h.in *** parrot_cvs/config_h.in Wed Oct 24 07:36:57 2001 --- parrot/config_h.in Wed Oct 24 07:53:11 2001 *** *** 24,31 --- 24,33 #define FRAMES_PER_PMC_REG_CHUNK FRAMES_PER_CHUNK #define FRAMES_PER_NUM_REG_CHUNK FRAMES_PER_CHUNK + #define FRAMES_PER_EXC_REG_CHUNK FRAMES_PER_CHUNK #define FRAMES_PER_INT_REG_CHUNK FRAMES_PER_CHUNK #define FRAMES_PER_STR_REG_CHUNK FRAMES_PER_CHUNK #define MASK_STACK_CHUNK_LOW_BITS ${stacklow} + #define MASK_EXC_CHUNK_LOW_BITS ${intlow} #define MASK_INT_CHUNK_LOW_BITS ${intlow} #define MASK_NUM_CHUNK_LOW_BITS ${numlow} diff --recursive -C 2 parrot_cvs/core.ops parrot/core.ops *** parrot_cvs/core.ops Wed Oct 24 07:36:57 2001 --- parrot/core.ops Wed Oct 24 07:56:30 2001 *** *** 120,123 --- 120,127 + =item Bset(e, i) + + =item Bset(i, e) + =item Bset(i, i) *** *** 136,141 =cut ! AUTO_OP set(i, i|ic) { $1 = $2; } --- 140,148 =cut + AUTO_OP set(e, i) { + $1 = $2; + } ! AUTO_OP set(i, e|i|ic) { $1 = $2; } *** *** 684,688 AUTO_OP div(i, i|ic, i|ic) { ! $1 = $2 / $3; } --- 691,701 AUTO_OP div(i, i|ic, i|ic) { ! INTVAL z = $3; ! ! if(z == 0) { ! interpreter-exc_reg-registers[0] = 1; ! } else { ! $1 = $2 / $3; ! } } *** *** 1504,1507 --- 1517,1522 + =item Bpope() + =item Bpopi() *** *** 1517,1520 --- 1532,1539 =cut + AUTO_OP pope() { + Parrot_pop_e(interpreter); + } + AUTO_OP popi() { Parrot_pop_i(interpreter); *** *** 1536,1539 --- 1555,1560 + =item
[PATCHES] Exception idea
[Apologies if this is a repeat, but the last message was sent early Wednesday and hasn't gone through yet.]

The promised patches (against Wednesday morning's CVS-latest) are attached to this message. [You might need to reverse the first patch, against MANIFEST.] These patches add the following:

a) Exception register stack (E0-E31 for the moment, will trim down to just E0)
b) div_i_i_ic altered to raise an exception when ic==0 (which is to say, it sets E0 to 1)
c) New instructions set_e_i, set_i_e, push_e, pop_e
d) New test file t/op/exception.t, updated MANIFEST file

The tests exercise the new instructions and validate that div_i_i_ic properly raises an exception. The patches are a -very- crude form of exception handling. The constants for errors like DIVIDE_BY_ZERO should probably be imported as manifest constants, but that change would have been beyond the scope of the patch :)

Sample code that catches the divide-by-zero exception is in t/op/exception.t, test #4, but here's a better explanation (the code uses instructions that aren't implemented yet):

    pushe                                 # Save the current exceptions
    set I2,5
    div I1,I2,0                           # This would ordinarily trigger a coredump. Not now.
    eq E0,DIVIDE_BY_ZERO,CATCH_EXCEPTION  # Not in the current patch, but easy to add
    pope                                  # Restore the exception stack

Rather than implementing a static set of flags in some sort of exception register, each exception becomes an integer constant that can be tested against. This leaves plenty of expansion room ((2**31)-1 possible exceptions, assuming they're all negative) with the slight inconvenience of not being able to test for a bitwise-or of exception flags. I don't see this as being a major inconvenience, as most of the time you'll be testing for a specific exception, at least at the assembler level.

The patch is incomplete, but then, so is the list of instructions that can raise exceptions. This way we have a mechanism in place to handle I/O exceptions when they're implemented (and I'm planning to work on instructions such as open_i_s, read_i, close_i over the weekend). For instance, an open I0,"foo" instruction (just an idea, syntax will likely be very different) would be able to set constants such as FILE_NOT_FOUND.

Since we're in assembler here, I'm not sure whether a single instruction should ever throw multiple exceptions; it probably shouldn't -anyway-. If one did, we could use E1-E31 for the others, but I feel that a single instruction should throw only one of a limited range of exceptions. For instance, open_i_sc should only throw one of (FILE_NOT_FOUND, NO_PERMISSION, FILE_READ_ONLY, ...).

I did consider using a bitfield of exceptions, but found it too limiting. The only benefit I can see of a bitfield is being able to test for multiple exceptions at the same time. It isn't worth limiting the number of flags to 32 or whatever just to be able to handle this rare case.

As usual, comments, criticisms, and questions are more than welcome.

-Jeff [EMAIL PROTECTED]

[EMAIL PROTECTED]
exception.diff
exception.t
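
The integer-code scheme described in this message can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: the `Interp` struct, its `exc` field (standing in for E0), the `EXC_*` constants and their values other than 1 for divide-by-zero, and the helpers `op_div` and `exc_is` are all hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exception codes: plain integers, not bit flags. */
enum {
    EXC_NONE           = 0,
    EXC_DIVIDE_BY_ZERO = 1,  /* the message says E0 is set to 1 for this case */
    EXC_FILE_NOT_FOUND = 2   /* illustrative value only, not from the patch */
};

/* Stand-in for the interpreter: a single exception register (E0). */
typedef struct {
    long exc;
} Interp;

/* Integer divide that records an exception code instead of crashing.
 * On a zero divisor the destination is left untouched. */
static void op_div(Interp *interp, long *dest, long a, long b)
{
    assert(interp != NULL && dest != NULL);
    if (b == 0) {
        interp->exc = EXC_DIVIDE_BY_ZERO;
    } else {
        *dest = a / b;
    }
}

/* Test for one specific exception, as the "eq E0,CONST,LABEL" line would. */
static int exc_is(const Interp *interp, long code)
{
    assert(interp != NULL);
    return interp->exc == code;
}
```

A caller would run `op_div` and then branch on `exc_is(&interp, EXC_DIVIDE_BY_ZERO)`. Because the codes are ordinary integers, only one can be held in the register at a time, which is the trade-off against a bitfield that the message discusses.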
Re: [PATCHES] Exception idea
At 10:34 AM 10/25/2001 -0400, Jeffrey Goff wrote:
>   pope # Restore the exception stack

I've been thinking about going with an exception stack rather than a set of exception registers, but there's something awfully compelling about an opcode named pope... :)

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Quick todo list
Here's a list of what I'm going to try to get done really soon (like in the next day or so):

*) Toss that stupid interpreter parameter. Going with thread-local storage instead. (And I know this is going to make Win32 unhappy)
*) Split the generic stack into a temp stack and control stack
*) Define parameter passing conventions
*) Define the exception handling mechanism
*) Simple open/read/write/close for files

(Why yes, I do have a lot of good coffee and a tin of caffeinated peppermints. Why do you ask? :)

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: String rationale
In message [EMAIL PROTECTED]
          Dan Sugalski [EMAIL PROTECTED] wrote:

> =item type
>
> What the character set or type of data is encoded in the buffer. This
> includes things like ASCII, EBCDIC, Unicode, Chinese Traditional,
> Chinese Simplified, or Shift-JIS. (And yes, I know the latter's a
> combination of type and encoding. I'll update the doc as soon as I
> can reasonably separate the two)

Isn't this going to need to be a vtable pointer like encoding is? Only some things (like character classification and at least some transcoding tasks) will be character set based rather than encoding based.

Other than that it looked quite good and I'll probably start looking at bending the existing code into the new model over the weekend.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
[PATCH] Making Win32 work
With the patch attached, all tests pass on Win32. Well, except for the fact that classes\intclass.obj gets created as .\intclass.obj, forcing you to manually copy it to the right place. Ugh. And examples\assembly\mops.obj has the same problem. And there are 11 warnings in intclass.c that I don't want to bother to fix. (classes\intclass.c(16) : warning C4716: 'Parrot_int_type' : must return a value and such.) Other than that, though, it works fine.

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

"When I take action, I'm not going to fire a $2 million missile at a $10 empty tent and hit a camel in the butt."
    --Dubya

--- ..\..\parrot-cvs\parrot\make_vtable_ops.pl Sun Oct 21 09:47:10 2001
+++ make_vtable_ops.pl Thu Oct 25 17:00:10 2001
@@ -1,35 +1,52 @@
 use Parrot::Vtable;
 my %vtable = parse_vtable();
+print "#define VTABLE_CALL_TYPE(func, type) ((op_func_t)((INTVAL)func + (INTVAL)type))\n\n";
+
 while (<DATA>) {
     next if /^#/ or /^$/;
     my @params = split;
     my $op = $params[1];
     my $vtable_entry = $params[2] || $op;
+
     die "Can't find $vtable_entry in vtable, line $.\n" unless exists $vtable{$vtable_entry};
+
     print "AUTO_OP $params[1] (".(join ", ", ("p")x$params[0]).") {\n";
-    print "\t(\$2->vtable->$vtable_entry";
-    print multimethod($vtable_entry);
+
+    print "\t".multimethod($vtable_entry);
+
     if ($params[0] == 3) {
         # Three-address function
-        print ')($2,$3,$1);';
+        print '($2,$3,$1);';
     } elsif ($params[0] == 2) {
         # Unary function
-        print ')($2,$1);';
+        print '($2,$1);';
     }
+
     print "\n}\n";
 }
+
 sub multimethod {
-    my $type = $vtable{$_[0]}{meth_type};
-    return "" if $type eq "unique";
-    return '_1 + $3->vtable->num_type' if $type eq "num";
-    return '_1 + $3->vtable->string_type' if $type eq "str";
+    my $vtable_entry=shift;
+    my $type = $vtable{$vtable_entry}{meth_type};
+    my $firstarg="\$2->vtable->$vtable_entry";
+
+    return "(${firstarg})"
+        if $type eq "unique";
+
+    return "VTABLE_CALL_TYPE(${firstarg}_1, \$3->vtable->num_type)"
+        if $type eq "num";
+
+    return "VTABLE_CALL_TYPE(${firstarg}_1, \$3->vtable->string_type)"
+        if $type eq "str";
+
     die "Coding error - undefined type $type\n";
 }
+
 __DATA__
 # Three-address functions
 3 add
--- ..\..\parrot-cvs\parrot\core.ops Wed Oct 24 07:54:54 2001
+++ core.ops Thu Oct 25 14:27:46 2001
@@ -3,8 +3,16 @@
 */
 #include <math.h>
-#include <sys/time.h>
+#ifdef HAS_HEADER_SYSTIME
+  #include <sys/time.h>
+#else
+  #ifdef WIN32
+#include <time.h>
+__declspec(dllimport) void __stdcall Sleep(unsigned long);
+  #endif /* WIN32 */
+#endif /* HAS_HEADER_SYSTIME */
+
 =head1 NAME
 core.ops
@@ -95,9 +103,19 @@
 =cut
 AUTO_OP time(n) {
+#ifdef HAS_HEADER_SYSTIME
+
   struct timeval t;
   gettimeofday(&t, NULL);
   $1 = (FLOATVAL)t.tv_sec + ((FLOATVAL)t.tv_usec / 100.0);
+
+#else
+
+  /* Win32 doesn't have gettimeofday or sys/time.h, so just use normal time w/o
+microseconds XXX Is there a Win32 equivalent to gettimeoday? */
+  $1 = (FLOATVAL)time(NULL);
+
+#endif
 }
@@ -1786,7 +1804,11 @@
 =cut
 AUTO_OP sleep(i|ic) {
-  sleep($1);
+  #ifdef WIN32
+Sleep($1*1000);
+  #else
+sleep($1);
+  #endif
 }
 ###
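
On the open question in the patch comment (whether Win32 has an equivalent of gettimeofday), one possible approach is sketched below as a standalone C file. It is not part of the patch and makes several assumptions: it tests `_WIN32` rather than the patch's `WIN32` and `HAS_HEADER_SYSTIME` macros, it includes `windows.h` directly (which, per the earlier thread, conflicts with Parrot's own BOOL definition, so it would not drop into core.ops as-is), and the names `portable_time` and `portable_sleep` are hypothetical. On Windows it reads `GetSystemTimeAsFileTime`, which counts 100-nanosecond units since 1601-01-01, and shifts the result to the Unix epoch.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/time.h>
#include <unistd.h>
#endif

/* Seconds since the Unix epoch as a double, with sub-second precision. */
static double portable_time(void)
{
#ifdef _WIN32
    FILETIME ft;
    ULARGE_INTEGER ticks;   /* 100 ns units since 1601-01-01 (UTC) */

    GetSystemTimeAsFileTime(&ft);
    ticks.LowPart  = ft.dwLowDateTime;
    ticks.HighPart = ft.dwHighDateTime;
    /* 11644473600 is the number of seconds between 1601 and 1970. */
    return (double)ticks.QuadPart / 1e7 - 11644473600.0;
#else
    struct timeval t;
    int rc = gettimeofday(&t, NULL);

    assert(rc == 0);
    (void)rc;
    /* tv_usec is in microseconds, hence the division by one million. */
    return (double)t.tv_sec + (double)t.tv_usec / 1000000.0;
#endif
}

/* Sleep for a whole number of seconds on either platform. */
static void portable_sleep(unsigned int seconds)
{
#ifdef _WIN32
    Sleep((DWORD)seconds * 1000);   /* Sleep() takes milliseconds */
#else
    unsigned int left = seconds;

    while (left > 0)
        left = sleep(left);         /* sleep() returns the unslept remainder */
#endif
}
```

Keeping the `#ifdef` inside two small helpers, rather than inside each op body, is one way to confine the platform differences to a single place.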
Re: [PATCHES] Exception idea
Yeah, I probably should have named the register stack 'X' or something like that. At least we're thinking along somewhat compatible lines. I'll be eager to see your solution...

Dan Sugalski wrote:

> At 10:34 AM 10/25/2001 -0400, Jeffrey Goff wrote:
> >   pope # Restore the exception stack
>
> I've been thinking about going with an exception stack rather than a set of
> exception registers, but there's something awfully compelling about an
> opcode named pope... :)
>
> 					Dan
>
> --------------------------------------"it's like this"-------------------
> Dan Sugalski                          even samurai
> [EMAIL PROTECTED]                     have teddy bears and even
>                                       teddy bears get drunk
Re: String rationale
At 11:59 PM 10/25/2001 +0100, Tom Hughes wrote:
> In message [EMAIL PROTECTED]
>           Dan Sugalski [EMAIL PROTECTED] wrote:
>
> > =item type
> >
> > What the character set or type of data is encoded in the buffer. This
> > includes things like ASCII, EBCDIC, Unicode, Chinese Traditional,
> > Chinese Simplified, or Shift-JIS. (And yes, I know the latter's a
> > combination of type and encoding. I'll update the doc as soon as I
> > can reasonably separate the two)
>
> Isn't this going to need to be a vtable pointer like encoding is?

Yup. I'd intended it to be an index into a table of character set functions. Jarkko has convinced me that it's better to have it as a vtable pointer, but I haven't had a chance to update the docs yet.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Ooops, sorry for that blank log message.
Darn it, I fat-fingered the log message.

This is a fix which changes the way op variants are handled. The old method forgot the last variant, so thing(i,i|ic,i|ic) would generate:

    thing(i,i,i)
    thing(i,i,ic)
    thing(i,ic,i)

but not

    thing(i,ic,ic)

The new one does.

Brian
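
The variant expansion described in this message amounts to a Cartesian product over each argument's alternatives. A minimal C sketch of that idea follows. It is not the actual fix (the real tooling is Perl), and `split_alts`, `expand_variants`, and the fixed-size limits are assumptions made for the example. The odometer loop only stops after every position has rolled over, so the final combination is emitted rather than dropped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8    /* hypothetical limits chosen for the sketch */
#define MAX_ALTS 8
#define MAX_LEN  64

/* Split a spec such as "i|ic" into {"i","ic"}; returns the count. */
static int split_alts(const char *spec, char alts[MAX_ALTS][MAX_LEN])
{
    int n = 0;
    const char *p = spec;

    while (n < MAX_ALTS) {
        size_t len  = strcspn(p, "|");
        size_t copy = len < MAX_LEN - 1 ? len : MAX_LEN - 1;

        memcpy(alts[n], p, copy);
        alts[n][copy] = '\0';
        n++;
        if (p[len] == '\0')
            break;
        p += len + 1;   /* skip past the '|' */
    }
    return n;
}

/* Write every variant "name(a,b,c)" into out; returns how many were written. */
static int expand_variants(const char *name, const char *specs[], int nargs,
                           char out[][MAX_LEN], int max_out)
{
    char alts[MAX_ARGS][MAX_ALTS][MAX_LEN];
    int count[MAX_ARGS];
    int idx[MAX_ARGS] = {0};
    int produced = 0;
    int i;

    assert(name != NULL && specs != NULL && out != NULL);
    if (nargs < 1 || nargs > MAX_ARGS)
        return 0;
    for (i = 0; i < nargs; i++)
        count[i] = split_alts(specs[i], alts[i]);

    while (produced < max_out) {
        char *dst = out[produced];
        int used = snprintf(dst, MAX_LEN, "%s(", name);

        /* Emit the variant selected by the current odometer reading. */
        for (i = 0; i < nargs && used < MAX_LEN; i++)
            used += snprintf(dst + used, (size_t)(MAX_LEN - used), "%s%s",
                             i ? "," : "", alts[i][idx[i]]);
        if (used < MAX_LEN)
            snprintf(dst + used, (size_t)(MAX_LEN - used), ")");
        produced++;

        /* Advance the odometer, rightmost argument first. */
        for (i = nargs - 1; i >= 0; i--) {
            if (++idx[i] < count[i])
                break;
            idx[i] = 0;
        }
        if (i < 0)
            break;      /* all positions rolled over: nothing left to emit */
    }
    return produced;
}
```

Called with the name `thing` and the specs `i`, `i|ic`, `i|ic`, the sketch fills `out` with four entries, the last being `thing(i,ic,ic)`.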