A few fixed japhs
I've fixed a few of the japhs, 3-7. I didn't leave japh7.pasm obfuscated any more than a japh should be.

Attachments: japh3.pasm, japh4.pasm, japh5.pasm, japh6.pasm, japh7.pasm (binary data)
Re: Three more shoot outs
On Thu, Dec 15, 2005 at 11:15:20PM -0600, Joshua Isom wrote:
> I noticed a slight glitch with the regex-dna benchmark. The benchmark
> spec says to account for case insensitivity. So I added the :i
> modifier to the patterns and just stuck to the p6 rules. But using the
> :i modifier makes it take over three times as long.

I'm not too surprised that :i slows things down -- afaik Parrot doesn't have a case-insensitive string compare, so PGE downcases the strings for comparison. It's currently doing this at each comparison; it would probably be quicker to just make a copy of the downcased string and use that for comparison.

Pm
Re: Three more shoot outs
I noticed a slight glitch with the regex-dna benchmark. The benchmark spec says to account for case insensitivity. So I added the :i modifier to the patterns and just stuck to the p6 rules. But using the :i modifier makes it take over three times as long. Although for the example and the full benchmark the case sensitivity seems to be irrelevant, I've modified it to add the :i and stripped out the p5 rules. It also takes a fair chunk of memory. Also, reading in the full benchmark data file takes a while doing it line by line -- 83337 lines.

Attachment: regexdna.pir (binary data)
Three more shoot outs
I just finished three more shoot outs. Two are rather simple: a floating point version of ack, and another that reads from stdin and adds together the numbers on the lines. The third is regex-dna. It cheats a little, since as far as I know PGE doesn't have any regex-based substitutions -- they're not needed for this benchmark, aside from the fact that it's supposed to strip newlines and fasta headers with a substitution. Anyway, the floating point takfp is slow, 364 seconds for me, which makes it really really slow. Python's the fastest of Perl, Tcl, Python, Ruby, and PHP; Parrot doesn't fare too well. For regex-dna, I used PGE's perl 5 rules. The commented-out sections use the perl 6 rules, but then you'll need to comment out the perl 5 areas... It's taking 25 seconds for me for the test file, not the benchmark file; Perl finishes the same file in .6 seconds. It reads in line by line, which slows it down, but it makes stripping the unwanted things much easier (unless someone can tell me how to work in a regex to strip it, to truly comply). Only 8 languages are included in the benchmark, but parrot can be added... The DOD is taking the most time: I get it done in 6.36 seconds (and a lot of memory) if I put `sweepoff` in the file, but it does take up a large amount of ram.

Attachments: takfp.pir, sumcol.pir, regexdna.pir (binary data)
Re: Transliteration preferring longest match
On Thu, Dec 15, 2005 at 09:56:09PM +, Luke Palmer wrote:
> On 12/15/05, Brad Bowman <[EMAIL PROTECTED]> wrote:
> > Why does the longest input sequence win?
> > Is it for some consistency that I'm not seeing? Some exceedingly
> > common use case? The rule seems unnecessarily restrictive.
>
> Hmm. Good point. You see, the longest token wins because that's an
> exceedingly common rule in lexers, and you can't sort regular
> expressions the way you can sort strings, so there needs to be special
> machinery in there.
>
> There are two rather weak arguments to keep the longest token rule:
>
> * We could compile the transliteration into a DFA and make it
>   fast. Premature optimization.
> * We could generalize transliteration to work on rules as well.
>
> In fact, I think the first Perl module I ever wrote was
> Regexp::Subst::Parallel, which did precisely the second of these.
> That's one of the easy things that was hard in Perl (but I guess
> that's what CPAN is for). Hmm.. none of these is really a compelling
> argument either way.

If a shorter rule is allowed to match first, then the longer rule can be removed from the match set, at least for constant string matches. If, for example, '=' can match without preferring to try first for '==' then you'll never match '==' without syntactic help to force a backtracking retry.

--
harmonic test program for shootout (attached)
This one is really trivial, but I'm not complaining.

=head1 NAME

examples/shootout/harmonic.pir - Partial sum of Harmonic series

=head1 SYNOPSIS

    % ./parrot examples/shootout/harmonic.pir 1000

=head1 DESCRIPTION

Translated from C code by Greg Buchholz into PIR by Peter Baylies
<[EMAIL PROTECTED]>.

The C code is:

    /* The Great Computer Language Shootout
       http://shootout.alioth.debian.org/
       contributed by Greg Buchholz
       Optimized by Paul Hsieh
       compile: gcc -O2 -o harmonic harmonic.c
    */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        double i = 1, sum = 0;
        int n;

        for (n = atoi(argv[1]); n > 0; n--, i++)
            sum += 1/i;

        printf("%.9f\n", sum);
        return 0;
    }

=cut

.sub 'main' :main
    .param pmc argv
    .local int argc
    .local int n
    .local num i, sum
    i = 1
    sum = 0
    argc = argv
    if argc <= 1 goto NREDO
    $S0 = argv[1]
    n = $S0
NREDO:
    $N1 = 1 / i
    sum += $N1
    inc i
    dec n
    if n goto NREDO
PRINT:
    $P0 = new .FixedFloatArray
    $P0 = 1
    $P0[0] = sum
    $S0 = sprintf "%.9f\n", $P0
    print $S0
    end
.end
mandelbrot test program for shootout (attached)
The mandelbrot benchmark looked like it'd be an easy one to implement, and lo and behold, it was! I haven't optimized this at all really, but it seems to run fairly quickly anyhow.

-- Peter Baylies

=head1 NAME

examples/shootout/mandelbrot.pir - Print the Mandelbrot set

=head1 SYNOPSIS

    % ./parrot examples/shootout/mandelbrot.pir 200 > out.pbm

=head1 DESCRIPTION

This outputs a pbm file of the Mandelbrot set. Defaults to 200x200.

Translated from C code by Greg Buchholz into PIR by Peter Baylies
<[EMAIL PROTECTED]>.

The C code is:

    /* The Great Computer Language Shootout
       http://shootout.alioth.debian.org/
       contributed by Greg Buchholz
       compile: gcc -O2 -o mandelbrot mandelbrot.c
       run: mandelbrot 200 > out.pbm
    */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int w, h, x, y, bit_num = 0;
        char byte_acc = 0;
        int i, iter = 50;
        double limit = 2.0;
        double Zr, Zi, Cr, Ci, Tr, Ti;

        w = atoi(argv[1]);
        h = w;

        printf("P4\n%d %d\n", w, h);

        for (y = 0; y < h; y++)
        {
            for (x = 0; x < w; x++)
            {
                Zr = 0.0; Zi = 0.0;
                Cr = (2*(double)x/w - 1.5);
                Ci = (2*(double)y/h - 1);

                for (i = 0; i < iter; i++)
                {
                    Tr = Zr*Zr - Zi*Zi + Cr;
                    Ti = 2*Zr*Zi + Ci;
                    Zr = Tr; Zi = Ti;
                    if (Zr*Zr + Zi*Zi > limit*limit) break;
                }

                if (Zr*Zr + Zi*Zi > limit*limit)
                    byte_acc = (byte_acc << 1) | 0x00;
                else
                    byte_acc = (byte_acc << 1) | 0x01;

                bit_num++;
                if (bit_num == 8)
                {
                    putc(byte_acc, stdout);
                    byte_acc = 0;
                    bit_num = 0;
                }
                else if (x == w - 1)
                {
                    byte_acc = byte_acc << (8 - w%8);
                    putc(byte_acc, stdout);
                    byte_acc = 0;
                    bit_num = 0;
                }
            }
        }
        return(0);
    }

=cut

.sub 'main' :main
    .param pmc argv
    # int w, h, x, y, bit_num = 0;
    # char byte_acc = 0;
    # int i, iter = 50;
    # double limit = 2.0;
    # double Zr, Zi, Cr, Ci, Tr, Ti;
    .local int w, h, x, y, bit_num, byte_acc, i, iter
    .local num limit, Zr, Zi, Cr, Ci, Tr, Ti
    .sym int argc
    bit_num = 0
    byte_acc = 0
    iter = 50
    limit = 2.0
    # slight optimization here -- nothing a decent C compiler wouldn't already do :)
    limit = limit * limit
    argc = argv
    w = 200
    if argc <= 1 goto noarg
    # w = atoi(argv[1]);
    $S0 = argv[1]
    w = $S0
    # h = w;
noarg:
    h = w
    # printf("P4\n%d %d\n",w,h);
    print "P4\n"
    print w
    print " "
    print h
    print "\n"
    y = 0
YREDO:
    x = 0
XREDO:
    # Zr = 0.0; Zi = 0.0;
    Zr = 0.0
    Zi = 0.0
    # Cr = (2*(double)x/w - 1.5);
    Cr = x
    Cr /= w
    Cr *= 2
    Cr -= 1.5
    # Ci = (2*(double)y/h - 1);
    Ci = y
    Ci /= h
    Ci *= 2
    Ci -= 1
    i = 0
IREDO:
    # Tr = Zr*Zr - Zi*Zi + Cr;
    $N1 = Zr * Zr
    $N2 = Zi * Zi
    Tr = $N1 - $N2
    Tr += Cr
    # Ti = 2*Zr*Zi + Ci;
    Ti = 2
    Ti *= Zr
    Ti *= Zi
    Ti += Ci
    # Zr = Tr; Zi = Ti;
    Zr = Tr
    Zi = Ti
    # if (Zr*Zr+Zi*Zi > limit*limit) break;
    # $N1 = Zr * Zr
    # $N2 = Zi * Zi
    $N1 += $N2
    if $N1 > limit goto IBRK
    inc i
    if i < iter goto IREDO
IBRK:
    byte_acc <<= 1
    if $N1 <= limit goto SLA
    byte_acc |= 0
    goto SLE
SLA:
    byte_acc |= 1
SLE:
    inc bit_num
    if bit_num != 8 goto NTST1
PRINT:
    chr $S1, byte_acc
    print $S1
    byte_acc = 0
    bit_num = 0
    goto NTSTE
NTST1:
    $I1 = w
    dec $I1
    if x != $I1 goto NTSTE
    $I1 = w
    $I1 %= 8
    $I1 = 8 - $I1
    byte_acc <<= $I1
    goto PRINT
NTSTE:
    inc x
    if x < w goto XREDO
    inc y
    if y < h goto YREDO
    end
.end
Re: Transliteration preferring longest match
On Thu, Dec 15, 2005 at 06:50:19PM +0100, Brad Bowman wrote:
:
: Hi,
:
: S05 describes an array version of trans for transliteration:
: ( http://dev.perl.org/perl6/doc/design/syn/S05.html#Transliteration )
:
: The array version can map one-or-more characters to one-or-more
: characters:
:
:     $str.=trans( [' ',      '<',    '>',    '&'    ] =>
:                  ['&nbsp;', '&lt;', '&gt;', '&amp;'] );
:
: In the case that more than one sequence of input characters matches,
: the longest one wins. In the case of two identical sequences the first
: in order wins.
:
: Why does the longest input sequence win?
: Is it for some consistency that I'm not seeing? Some exceedingly
: common use case? The rule seems unnecessarily restrictive.

On the contrary, it frees the user from having to worry about the order.

: The "first in order" rule is more flexible, the user can sort their
: arrays to produce the longest input rule, or use another order if that is
: preferred.

What possible use is a user-ordered rule set? If you put the shorter entry first, the longer one can never be reached. It's not like you can backtrack into a transliteration and pick a different entry.

: The first transliteration example even uses sort in
: the pair-wise form:
:
:     $str.trans( %mapping.pairs.sort );

That seems like a useless use of sort, and probably defeats the optimizer as well.

: Can we drop the longest preference?

Doesn't hurt anything, and can probably help. Plus we already have the longest token rule in there for magical hash matching in rules, so it's likely the optimizer will already know how to handle it, or something like it.

Larry
Re: relational data models and Perl 6
Ruud H.G. van Tol wrote:
> [RD-interface]

See also these Haskell Hierarchical Libraries (base package):

http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Set.html
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Map.html

--
Affijn, Ruud

"Gewoon is een tijger."
Re: Transliteration preferring longest match
On 12/15/05, Brad Bowman <[EMAIL PROTECTED]> wrote:
> Why does the longest input sequence win?
> Is it for some consistency that I'm not seeing? Some exceedingly
> common use case? The rule seems unnecessarily restrictive.

Hmm. Good point. You see, the longest token wins because that's an exceedingly common rule in lexers, and you can't sort regular expressions the way you can sort strings, so there needs to be special machinery in there.

There are two rather weak arguments to keep the longest token rule:

* We could compile the transliteration into a DFA and make it fast. Premature optimization.
* We could generalize transliteration to work on rules as well.

In fact, I think the first Perl module I ever wrote was Regexp::Subst::Parallel, which did precisely the second of these. That's one of the easy things that was hard in Perl (but I guess that's what CPAN is for). Hmm.. none of these is really a compelling argument either way.

Luke
Test::Harness spitting an error
I'm puzzled. I have a number of tests in a distribution. The tests reside in the /t subdirectory. When I run those tests from a command line, thusly:

    prove t/*.t

all tests pass just fine. No errors of any kind are spit out. I'm still working on the tests and they all currently use Test::More with "no_plan". I also have Test::Pod and Test::Pod::Coverage tests.

I wrote a test harness -- you'll find the code below my signature, if you're interested. When I run it, I get the following:

    You said to run 0 tests! You've got to run something.
    # Looks like your test died before it could output anything.

Then the tests run and the success/failure report for each module returns success. Anyone have any idea where this error is coming from? I'm assuming it's Test::Harness, but I'm not sure what I'm doing wrong. I've done some Googling, but I've so far found nothing useful. I see others are getting that error, but I'm not seeing it in the same context in which I'm getting it. If it matters, I'm running ActiveState 5.8.7 on Windows XP.
Regards,

Troy Denkinger

--- included code ---

#!/usr/bin/perl

use strict;
use warnings;

use File::Find;
use Cwd;
use Text::Reform qw( form );
use Data::Dumper;
use Getopt::Long;
use Pod::Usage;
use Test::Harness::Straps;

my $strap = Test::Harness::Straps->new();

our( @status, @filename, @expected, @run, @passed, @skipped, @todo, @todo_skipped );
our( $opt_help, $opt_path, $opt_email, $opt_noprint );

GetOptions(
    "help|h"    => \$opt_help,
    "path|p=s"  => \$opt_path,
    "email|e=s" => \$opt_email,
    "noprint|n" => \$opt_noprint,
);

pod2usage( -verbose => 2 ) if $opt_help;

my $path = $opt_path || getcwd;

find( \&test_found, $path );

my $report_date = localtime( time );

my $alert = form
    '',
    "Test suite at $path",
    "Report Date: $report_date",
    '',
    '                                Tests',
    'Status  Filename  Expected  Run  Passed  Skipped  TODO  TODO Skipped',
    '---------------------------------------------------------------------',
    ' [[     |||       ||||||    |||  |||     |||      |||   |||',
    \@status, \@filename, \@expected, \@run,
    \@passed, \@skipped, \@todo, \@todo_skipped;

print $alert unless $opt_noprint;

send_mail( $alert, "Test Suite: $path", $opt_email ) if $opt_email;

sub send_mail {
    my ( $alert, $subject, $to_email ) = @_;
    # Send email here
}

sub test_found {
    return unless m/\.t$/;
    return if $File::Find::name =~ m/exclude_tests/i;

    my %results = $strap->analyze_file( $File::Find::name );

    my $status = 'FAIL';
    $status = ' OK ' if $results{'max'} == $results{'seen'}
                     && $results{'seen'} == $results{'ok'};

    my ( $file ) = $File::Find::name;
    $file =~ s/$path//;

    push @status,       $status;
    push @filename,     $file;
    push @expected,     $results{'max'};
    push @run,          $results{'seen'};
    push @passed,       $results{'ok'};
    push @skipped,      $results{'skip'};
    push @todo,         $results{'todo'};
    push @todo_skipped, $results{'bonus'};
}

1;
Re: relational data models and Perl 6
[snip entire conversation so far]

(Please bear with me - I'm going to go in random directions.)

Someone please correct me if I'm wrong, but it seems that there are only a few things missing in P6:

1) An elegant way of creating a tuple-type (the "table", so to speak)
2) A way of providing constraints across the actual tuples of a given tuple-type
3) Syntactic sugar for performing the relational calculus

To me, a tuple-type is more than a class in the standard OO. It has to be able to apply any constraints that might be upon the tuple-type, such as uniqueness of a given element across all tuples or foreign-key constraints. While this is certainly possible using the P6 OO constructs, it would make sense for a baseclass to provide this functionality. Actually, this is a really great place for metaclasses to shine. The actual tuple-type needs to be constructed from some class-constructor (which would be, in the metamodel, itself a class). This is so that it has the appropriate types for the elements of the tuple along with any necessary constraints upon the tuples / elements of the tuples. In addition, you're going to want to take actions not just on the tuple, but on the entire tuple-type. That screams class-level methods that operate across all instances of the class. Maybe a set of roles would be good for organizing this kind of across-all-instances behavior that the tuple-type can take advantage of. I'm sure that this wouldn't be limited to just the relational calculus.

As for the syntactic sugar, I'm not quite sure what should be done here. And, with macros, it's not clear that there needs to be an authoritative answer. Personally, I'd simply overload + for union, - for difference, * for cross-product, / for divide, and so forth. There's been some discussion with sets as to creating new operators using the set-operators that come in Unicode. As tuples and relations among tuples aren't necessarily sets, those might not be appropriate.
It also seems clear that junctionish iterators may be of use here. For example, "Give me all the tuples that match this criteria" might return an iterator that also acts as an any-junction. It could also return a class object that has a different set of instances marked as created from it. Though, I'm not too sure how that would work when asking a given instance "who is the class object that created you?" ... maybe it returns the initial one or maybe it returns them all? I think the initial one is more correct, as the others are just subsets. When dealing with SQL, I don't care about the subsets that a given row belongs to - I only care about the table. So, maybe the subset class objects delegate all methods to the original class object except for those that deal with "Who do you have?" and "Give me a subset where ..." Also, joins between tuple-types would have to create a new tuple-type, with the tuples within being delegators to the underlying tuples? I'm not sure that this (or any other) derived tuple-type class object should be allowed to create new tuples (though I'm sure someone can think of a good reason why I'm wrong). Again, just a bunch of meandering thoughts. Bonus points to whomever can help me bridge the gap between what I just blathered and an elegant solution to Sudoku. Rob
Re: [perl #37947] Patch to give yet more output on smoke
On Wednesday 14 December 2005 12:09, Alberto Simoes wrote: > Basically, count tests, count tests ok, give rate. Useful if you want to > run smoke and look to the output just at the end. Thanks, applied with sprintf() tweaks as #10538. Perhaps the maintainer of Test::TAP::Model should look at this patch too. -- c
Re: relational data models and Perl 6
Darren Duncan wrote:
> As an addendum to what I said before ... I would want the set
> operations for tuples to be like that, but the example code that Luke
> and I expressed already, with maps and greps etc, seems to smack too
> much of telling Perl how to do the job. I don't want to have to use
> maps or greps or whatever, to express the various relational
> operations.

I think you're reading too many semantics into C<map> and C<grep>: they don't tell perl *how* to implement the search, any more than C<SELECT> would. The example was:

    INSERT INTO NEWREL SELECT FROM EMP WHERE DNO = 'D2';

vs.

    my $NEWREL = $EMP.grep:{ $.DNO eq 'D2' };

The implementation of $EMP.grep depends very much on the class of $EMP. If this is an array-ref, then it is reasonable to think that the grep method would iterate the array in order. However, if the class is "unordered set", then there is no such expectation on the implementation.

The deeper problem is probably the use of the "eq" operator in the test. Without knowing a priori what operations (greps) will be performed on the relation, it is not possible to optimize the data structure for those specific operations. For example, if we knew that $EMP should store its data based on the {$.DNO eq 'D2'} equivalence class, then this grep would have high performance (possibly at the expense of its creation).

In theory, a sufficiently magical module could examine the parse tree (post type-inference), and find all the calls to C<grep> on everything that's a tuple -- and use that to attempt optimizations of a few special cases (e.g. a code block that contains just an "eq" test against an attribute). I'm not sure how practical this would be, but I don't see how a different syntax (e.g. s/grep/where/) would be more declarative in a way that makes this task any easier.
Re: relational data models and Perl 6
Darren Duncan wrote:
> If you take ...
>
> +-+-+
> |a|x|
> |a|y|
> |a|z|
> |b|x|
> |c|y|
> +-+-+
>
> ... and divide it by ...
>
> +-+
> |x|
> |z|
> +-+
>
> ... the result is ...
>
> +-+
> |a|
> +-+
>
> I'm not sure if Divide has an equivalent in SQL.

A verbose way to do it:

    SELECT C_abc
    FROM   T_abc_xyz
           NATURAL INNER JOIN
           T_xz
    GROUP BY C_abc
    HAVING Count(T_abc_xyz.C_xyz) = (SELECT Count(*) FROM T_xz);

This basically filters the INNER JOIN result-set to only keep those subsets that have the required number of rows. It requires that the rows of each table are unique, so there can not be another (b,x) in T_abc_xyz. That is a normal requirement.

--
Grtz, Ruud
Transliteration preferring longest match
Hi,

S05 describes an array version of trans for transliteration:
( http://dev.perl.org/perl6/doc/design/syn/S05.html#Transliteration )

The array version can map one-or-more characters to one-or-more characters:

    $str.=trans( [' ',      '<',    '>',    '&'    ] =>
                 ['&nbsp;', '&lt;', '&gt;', '&amp;'] );

In the case that more than one sequence of input characters matches, the longest one wins. In the case of two identical sequences the first in order wins.

Why does the longest input sequence win? Is it for some consistency that I'm not seeing? Some exceedingly common use case? The rule seems unnecessarily restrictive.

The "first in order" rule is more flexible: the user can sort their arrays to produce the longest-input rule, or use another order if that is preferred. The first transliteration example even uses sort in the pair-wise form:

    $str.trans( %mapping.pairs.sort );

Can we drop the longest preference?

Brad

--
By inconsistency and frivolity we stray from the Way and show ourselves to be beginners. In this we do much harm. -- Hagakure http://bereft.net/hagakure/
Re: Variables, Aliasing, and Undefined-ness
Leopold Toetsch <[EMAIL PROTECTED]> wrote:
> Matt Diephouse wrote:
> > $alias = undef
> >
> > translates to
> >
> >     null $P1
> >     $P2 = getinterp
> >     $P2 = $P2["lexpad"; 1]
> >     $P2['$alias'] = $P1
>
> Given that you are using DynLexPad, you just do:
>
>     delete $P2['alias']

If only it were that simple. A delete operation like this will break the aliasing. And only $alias will be undefined; $foo must also be undefined. And if either becomes defined again, they must still point to the same PMC. Also, delete_keyed() isn't implemented in DynLexPad.

--
matt diephouse
http://matt.diephouse.com
Re: More shootout, two randoms - c's static
Joshua Isom wrote:
> On Dec 12, 2005, at 4:47 PM, Leopold Toetsch wrote:
>> Well, we don't have a C-like static construct.
>
> Today I remembered something I read about how pir handles pasm
> registers, "PASM registers keep their register."

Yes, but not across function calls. I've a version here that uses a closure to achieve that effect, but that's slower than the current global solution.

leo
Re: More shootout, two randoms - c's static
On Dec 12, 2005, at 4:47 PM, Leopold Toetsch wrote:
> Well, we don't have a C-like static construct.

Today I remembered something I read about how pir handles pasm registers: "PASM registers keep their register. During the usage of a PASM register this register will be not get assigned to." (docs/imcc/operation.pod) I modified the random.pir code to use a pasm register and hoped it'd behave as I'm interpreting it. First I tried an .emit section without results, but figured it never was executed, so I put the initialization to 42 into the sub, and it still wasn't working. The operation file gives the impression that parrot won't reset a pasm register and essentially leaves it for the programmer to deal with. But from what I'm seeing, parrot's treating a pasm register as if it were a pir temporary register. Is there any clarification of this?
Re: Variables, Aliasing, and Undefined-ness
Matt Diephouse wrote: > So what am I supposed to do? It appears that using `null` to mark > deleted/undefined variables won't work. But it's not clear to me that > using a Null PMC is a good idea... Here's one possibility: you can use one of the PObj_private PMC flags to store the defined/undefined status of each variable. Then override the 'defined' vtable entry to examine this bit. Then use the 'defined' opcode to test the defined-ness of any TCL PMC. Or, your other vtable entries can read this bit themselves and you may not need to test it from the PIR. It seems to me that's what the 'defined' opcode is supposed to be for. But I haven't done it this way myself so I might be talking through my hat here. Regards, Roger Browne
Re: Variables, Aliasing, and Undefined-ness
Matt Diephouse wrote:
> $alias = undef
>
> translates to
>
>     null $P1
>     $P2 = getinterp
>     $P2 = $P2["lexpad"; 1]
>     $P2['$alias'] = $P1

Given that you are using DynLexPad, you just do:

    delete $P2['alias']

HTH
leo
Re: relational data models and Perl 6
On Dec 15, 2005, at 2:19, Darren Duncan wrote:
> * a Tuple is an associative array having one or more Attributes, and
>   each Attribute has a name or ordinal position and it is typed
>   according to a Domain; this is like a restricted Hash in a way,
>   where each key has a specific type
>
> * a Relation is an unordered set of Tuples, where every Tuple has the
>   same definition, as if the Relation were akin to a specific Perl
>   class and every Tuple in it were akin to a Perl object of that class

Something that puzzled me in "Database in Depth" is that jargon, supposedly math-based. A relation in math is just a subset of a Cartesian product, and a tuple is an element of a relation. So it's standard for a Relation type to be a set of Tuples, but a tuple itself is not a set (as are "tuples" in the book, argh). So if something unordered like that goes into the language to mimic that model, I wouldn't call it Tuple. Math conventions there are well established; the jargon in "Database in Depth" departs from them and I don't think it is a good idea to adopt it.

-- fxn
[perl #37951] Bad permissions on docs/ops/*
# New Ticket Created by Joshua Isom
# Please include the string: [perl #37951]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=37951 >

For all the files in docs/ops, the permissions are set to 600. If parrot's installed by root, these become unreadable by anyone but root.
Variables, Aliasing, and Undefined-ness
While working out some bugs in ParTcl I came across something roughly equivalent to the following Perl code (I'm using Perl because I believe more people know Perl than Tcl, at least on this list):

    #!/usr/bin/perl
    $var = "Foo";
    *alias = *var;
    $alias = undef;
    $alias = "Baz";
    print $var, "\n";

And I'm stuck wondering how I'm supposed to implement that in PIR. Or at least what the best way is to implement that in PIR. Currently, ParTcl works this way:

    $alias = undef

translates to

    null $P1
    $P2 = getinterp
    $P2 = $P2["lexpad"; 1]
    $P2['$alias'] = $P1

That is, we null variables when we want them to appear almost as if they'd never existed. (Almost, because aliases still work.) Tcl is a bit stricter than Perl, so any time we try to read an undefined value we get an error. So

    print $alias

translates to something like

    $P0 = find_lexical '$alias'
    if null $P0 goto error
    "&print"($P0)
    end
  error:
    print "Undefined variable '$alias'"
    end

where the error will be given if $alias was never assigned a value or if we assign undef to $alias (`$alias = undef`). Normal assignment in Tcl looks like this:

    $P0 = find_lexical '$alias'
    assign $P0, new_value

We use `assign` to preserve any aliases (so that $foo == new_value, in other words). However, if $alias is undefined (or `null`, in PIR-speak), then that assignment fails with a "Null PMC access" error.

So what am I supposed to do? It appears that using `null` to mark deleted/undefined variables won't work. But it's not clear to me that using a Null PMC is a good idea (then we must perform `isa` tests on every read to see if that variable is undefined, which seems like it would be expensive). So what's the "correct" way to do this?

--
matt diephouse
http://matt.diephouse.com
Re: relational data models and Perl 6
As an addendum to what I said before ... The general kind of thing I am proposing for Perl 6 to have is a declarative syntax for more kinds of tasks, where you can simply specify *what* you want to happen, and you don't have to tell Perl how to perform that task. An example of declaratives that is already specified is hyper-operators; you don't have to tell Perl how to iterate through various lists or divide up tasks. I would want the set operations for tuples to be like that, but the example code that Luke and I expressed already, with maps and greps etc, seems to smack too much of telling Perl how to do the job. I don't want to have to use maps or greps or whatever, to express the various relational operations. -- Darren Duncan
Re: relational data models and Perl 6
At 2:54 AM + 12/15/05, Luke Palmer wrote:
> On 12/15/05, Darren Duncan <[EMAIL PROTECTED]> wrote:
> > I propose, perhaps redundantly, that Perl 6 include a complete set
> > of native
>
> Okay, I'm with you here. Just please stop saying "native" and "core".
> Everyone.

Yes, of course. What I meant was that I considered relational data important enough for common programming to be considered by the Perl 6 language designers, so that the language allows for it to be elegantly represented and processed. The implementation details aren't that important.

I would like to hear from Ovid and Dave Rolsky on this issue too, as they seem to have been researching pure relational models. As am I now. My own database access framework in development is evolving to be centered more around an ideal relational model rather than simply what SQL or existing databases define. It does any serious database developer good to be familiar with what the relational model actually says, and not just what tangential things have actually been implemented by various vendors. The sources I cited are good reference and/or explanatory materials.

> > Essentially it comes down to better handling of data sets.
>
> Cool. I've recently been taken by list comprehensions, and I keep
> seeing "set comprehensions" in my math classes. Maybe we can steal
> some similar notation.

You probably could; the terms used in relational theory are mostly or entirely from mathematics. (I could stand to learn more about those maths too.)

> Hmm. I would say it's a hash not so much. For instance, the
> difference between an array and a tuple in many languages is that an
> array is homogeneously-typed--that's what allows you to access it
> using runtime values (integers). Tuples are heterogeneously-typed,
> so you can't say
>
>     my $idx = get_input();
>     say $tuple[$idx];
>
> (Pretend that Perl 6 is some other language :-), because the compiler
> can't know what type it's going to say. In the same way, I see a hash
> as homogeneously-typed, because you can index it by strings.
> What you're referring to as a tuple here would be called a "record"
> or a "struct" in most languages.

Yes, you are right; a Tuple is very much a "record" or a "struct"; I just didn't use those because Perl doesn't have them per se; the closest thing that Perl has is the "object", which you could say is exactly equivalent.

> > * a Relation is an unordered set of Tuples, where every Tuple has
> > the same definition, as if the Relation were akin to a specific
> > Perl class and every Tuple in it were akin to a Perl object of that
> > class
>
> When you say "unordered set" (redundantly, of course), can this set
> be infinite? That is, can I consider this relation (using made-up set
> comprehension notation):
>
>     { ($x,$y) where $x & $y (in) Int, $x <= $y }
>
> And do stuff with it?

Yes you can. A set can be infinite. For example, the set of INTEGER contains every whole number from negative infinity to positive infinity. At the same time, this set excludes all fractional numbers and all data that is not a number, such as characters. This only becomes finite when you place bounds on the range, such as saying it has to be between +/- 2 billion.

> > Specifically what I would like to see added to Perl, if that
> > doesn't already exist, is a set of operators that work on
> > Relations, like set operations, such as these (these bulleted
> > definitions from "Database in Depth", 1.3.3, some context
> > excluded):
> >
> > * Restrict - Returns a relation containing all tuples from a
> > specified relation that satisfy a specified condition. For example,
> > we might restrict relation EMP to just the tuples where the DNO
> > value is D2.
>
> Well, if we consider a relation to be a set, then we can use the set
> operations:
>
>     my $newrel = $emp.grep: { .DNO === 'D2' };
>
> I don't know what EMP, DNO, and D2 are...
Part of the context I excluded before, from section 1.3.1, is that the author is talking about hypothetical DEPT (Department) and EMP (Employee) relations (tables); DEPT has the attributes [DNO, DNAME, BUDGET], and EMP has the attributes [ENO, ENAME, DNO, SALARY]; DEPT.DNO is referenced by EMP.DNO; DEPT.DNO and EMP.ENO are primary keys in their respective relations.

So the restrict example is like, as you said, but with EMP an object:

    my $NEWREL = $EMP.grep:{ $.DNO eq 'D2' };

A SQLish equivalent would be:

    INSERT INTO NEWREL SELECT FROM EMP WHERE DNO = 'D2';

> > * Project - Returns a relation containing all (sub)tuples that
> > remain in a specified relation after specified attributes have been
> > removed. For example, we might project relation EMP on just the ENO
> > and SALARY attributes.
>
> Hmm... Well, if we pretend that records and hashes are the same thing
> for the moment, then:
>
>     my $newrel = $emp.map: { .: };
>
> (See the new S06 for a description of the .: syntax)

Or with EMP an object:

    my $NEWREL = $EMP.map:{ $_.class.new( ENO => $_.ENO, SALARY => $.SALARY ) };

SQLish:

    INSERT INT