Re: Auto My?
Luke Palmer wrote:
> James Mastros writes:
> > Does this imply that it's now possible to type C<my @foo[23] = 42;>, and declare @foo? In the current perl, this doesn't work -- it's a syntax error. It'd certainly make many constructs easier.
>
> That looks weird to me. But as Rod points out, it can be useful with hashes.

Yes, that's the primary case I was thinking of. I was trying to find a smaller example. OTOH, I realize now you can do that with zip in P6, in which case you do have a mention of the whole variable to stick a my on -- C<my %foo = zip(@keys, @values);>. I think C<my [EMAIL PROTECTED] = @values;> reads better though, even though looking at it literally, you're attempting to lexicalize an element.

-=- James Mastros, theorbtwo
Re: Auto My?
James Mastros writes:
> Luke Palmer wrote:
> > James Mastros writes:
> > > Does this imply that it's now possible to type C<my @foo[23] = 42;>, and declare @foo? In the current perl, this doesn't work -- it's a syntax error. It'd certainly make many constructs easier.
> >
> > That looks weird to me. But as Rod points out, it can be useful with hashes.
>
> Yes, that's the primary case I was thinking of. I was trying to find a smaller example. OTOH, I realize now you can do that with zip in P6, in which case you do have a mention of the whole variable to stick a my on -- C<my %foo = zip(@keys, @values);>. I think C<my [EMAIL PROTECTED] = @values;> reads better though, even though looking at it literally, you're attempting to lexicalize an element.

Know what's cool?

    my %foo = @keys >>=><< @values;

I think we have enough WTDI now.

Luke
Re: Auto My?
On 2004-12-19 at 21:35:46, Luke Palmer wrote:
> In Perl 5 you can do the hackish:
>
>     (\my @foo)->[23] = 42;

Hm. My reaction to the above is, and I think I speak for the entire assemblage when I say this, "Yuckbo". :)

Now, (my @foo)[23] would be somewhat better, but of course, that's attempting to assign to an element of a nonce list, not an array.

I think it would be reasonable for { my @foo[23] = 42; } to be legal Perl 6 that declares @foo as lexical. Letting { my $foo[23] = 42; } work in Perl 5 would be weirder, but the fact that Perl 6's arrays always have @ means that it's pretty clear you're declaring an array rather than a single element.

Just my 2c.

-Mark
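
For reference, here is a small sketch of how the Perl 5 idioms mentioned in this thread behave today. It only covers Perl 5; the variable names (@keys, @values, %bar) are made up for illustration, and nothing here says how Perl 6 will end up treating C<my @foo[23] = 42;>.

```perl
use strict;
use warnings;

# The "hackish" form: take a reference to a freshly declared array and
# assign through it. @foo stays declared for the rest of the scope.
(\my @foo)->[23] = 42;

# The usual two-step way to fill a new hash from parallel lists:
# declare first, then assign to a hash slice.
my @keys   = qw(a b c);
my @values = (1, 2, 3);
my %bar;
@bar{@keys} = @values;

print scalar(@foo), " elements, last is $foo[23]\n";
print join(', ', map { "$_=$bar{$_}" } sort keys %bar), "\n";
```

The point of the thread is that the second form needs the declaration and the assignment on separate statements, which is exactly what a declaring slice assignment would avoid.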
Re: Auto My?
James Mastros skribis 2004-12-19 23:00 (+0100):
> Juerd wrote:
> > Just typing my before the first use of a variable isn't hard, and it makes things much clearer for both the programmer and the machine.
>
> Does this imply that it's now possible to type C<my @foo[23] = 42;>, and declare @foo? In the current perl, this doesn't work -- it's a syntax error. It'd certainly make many constructs easier.

I didn't mean to imply that, but I sure think it's a great idea, even though I can't think of useful examples (as everything I can come up with is more elegantly done with vector ops or map anyway).

Juerd
spaces for alignment
On Sun, Dec 19, 2004 at 06:44:33PM -0800, chromatic wrote:
> On Sun, 2004-12-19 at 20:25 -0600, Rod Adams wrote:
> > [snipped]
> >
> >     $x = 4;
> >     $y = 7;
> >     $z = 12;
> >     $r = 4543;
> >     $q = 121;
> >
> > With a fixed width font, like all code editors use, all the ='s line up, and I can quickly scan the var names to get to the one I want to change at that moment.
>
> If you align the equals signs yourself with spaces, you can use variable names of different lengths (and possibly improved meaningfulness in actual factual code) too.
>
> I'm only half-joking. Vertical alignment makes a dramatic difference to readability.
>
> -- c

Speaking of alignment, my understanding is that the .[] operator allows spacing while [] does not:

    $a = @shortnm    .[0 ];
    $b = @longername .[42];

-- stef
Re: MMD and VTABLE_find_method
Sam Ruby wrote:
> Leopold Toetsch wrote:
> > The caller sets:
> >
> >   mmd_flag := NULL  ... no MMD, plain method lookup
> >   mmd_flag := depth ... return the next matching method starting at the given parent search depth
>
> In the general case, how does the caller know that MMD is invoked?

Perl6 multi subs are denoted with the multi keyword. We need some extensions to pdd03 that pass this information on to Parrot. It basically boils down to a new opcode:

  call_MMD method, n

as described in the threads "MMD: more implications" and "MMD dispatch".

> > A cache invalidation function is called from add_method (and remove_method) which resets entries for the passed class.
>
> And, in some languages, all calls to set_attr or setprop type methods, where the value invoked may be invokable, or might obscure visibility to one that is. As calls to setting attributes/properties are frequent, my concern is that this may more than wipe out any potential benefit that such a cache may provide.

You don't have operator overloading implemented in py*, do you? Anyway the code generator emits:

  add Px, Py, Pz

Now some attribute set operations on the class, metaclass or in the __dict__ can mean an overloading of the __add__ method of C<Py>. To handle that correctly, you can either not emit an add opcode in the first place, or you have to track the attribute set operations so that you are able to call the user-provided __add__ method.

You can of course in the current scheme install an add MMD method that always does a full method lookup, but then you get the performance problem you are worrying about.

> Also, note that the Perl sub defined above is not a method.

Yes. But Perl6 allows multi subs to be called as methods on the first invocant too:

  $a.foo($b, $c)  :=  foo($a, $b, $c)

> > Comments welcome,
>
> Counter-proposal.
>
> I see no reason why a full multi-dimensional multi-method dispatch PMC could not commence immediately, complete with a fully-functional polymorphic inline cache. Once it is ready and tested, we can explore setting things up so that the various mmd_dispatch_* functions exploit this functionality for the existing predefined binary operations.

I don't see how this solves anything, except that you seem to be moving the burden of MMD to an additional PMC. What does this proposed MMD PMC do? How does it find the appropriate multi-method?

I've described a versatile MMD scheme that is able to do n-dimensional MMD. Counter-proposals are very welcome, but the proposal has to include the mechanism by which it works. A MMD PMC that does it is too thin, sorry.

> - Sam Ruby

leo
Re: Auto My?
On Sun, Dec 19, 2004 at 08:25:58PM -0600, Rod Adams wrote:
: Another facet of this discussion comes into account when also specifying type.
:
: from S9:
:     my bit @bits;
:     my int @ints;
:     my num @nums;
:     my int4 @nybbles;
:     my str @buffers;
:     my ref[Array] @ragged2d;
:     my complex128 @longdoublecomplex;
:
: Wouldn't this be much better as:
:     bit @bits;
:     int @ints;
:     num @nums;
:     int4 @nybbles;
:     str @buffers;
:     ref[Array] @ragged2d;
:     complex128 @longdoublecomplex;
:
: Given that most of the stated reservations had to deal with explicit declaration better defining scope, what is wrong with dropping the my in this case?

How 'bout ambiguity with unary ops like int and ref, or any other unaries anyone ever decides to have that might conflict with type names. Plus it's just visually confusing. Making it hard to tell declarations from statements is one of the areas where C made a big mistake, and I am not at all tempted to repeat it.

Larry
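
To make the ambiguity concrete, here is a minimal sketch of what C<int @ints> and C<ref @ints> already mean in Perl 5, where both words are named unary operators. This only shows current Perl 5 behaviour; how a Perl 6 parser would treat the same text is a separate question, and the array contents are invented for the example.

```perl
use strict;
use warnings;

my @ints = (10, 20, 30);

# 'int' is a named unary operator: its operand is evaluated in scalar
# context, so this is int(scalar @ints), i.e. the element count.
my $count = int @ints;

# 'ref' is also a named unary operator; the element count is not a
# reference, so the result is the empty string.
my $kind = ref @ints;

print "count=$count kind='$kind'\n";
```

So a bare C<int @ints;> is already a valid statement with a different meaning, which is the sort of clash a leading my avoids.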
Re: auxiliary variables
At 12:00 AM +0100 12/20/04, [EMAIL PROTECTED] wrote:
> Please, let's take two scalar variables in Perl and some operation on them, such as addition:
>
>   x = a + b
>
> I would like to know which auxiliary variables are created in the generated in-line code, for example in Parrot. Is it something like
>
>   T = a + b
>   x = T
>
> ???

For simple expressions there's no need for temps. x = a + b translates to

  add x, a, b

If you have more complex expressions and need temps, then generally the compiler will choose the correct temp type, since it's normally language dependent. (Though Parrot's Undef is generally clever enough to be a good generic destination, as it morphs to most destination types on assign)

--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
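
As a rough illustration of the answer above, here is a toy code generator written in Perl. It is not how any real Parrot compiler works: the tree format, the T1, T2, ... temporary names and the compile_expr helper are all assumptions made up for this sketch. It only shows that a flat expression needs no temporary, while a nested one needs one per inner operation.

```perl
use strict;
use warnings;

my $temp_id = 0;

# Compile an expression tree into three-address ops.
# A node is either a variable name or [ $op, $left, $right ].
# $dest is where the result must end up; ops are appended to @$ops.
sub compile_expr {
    my ($node, $dest, $ops) = @_;
    my ($op, @args) = @$node;
    my @regs = map {
        ref $_ ? compile_expr($_, 'T' . ++$temp_id, $ops) : $_
    } @args;
    push @$ops, "$op $dest, " . join(', ', @regs);
    return $dest;
}

my (@simple, @nested);
compile_expr([ add => 'a', 'b' ], 'x', \@simple);                  # x = a + b
compile_expr([ add => 'a', [ mul => 'b', 'c' ] ], 'x', \@nested);  # x = a + b * c

print "$_\n" for @simple, '--', @nested;
```

For x = a + b this emits the single op from the reply; for x = a + b * c it first emits a mul into the hypothetical temporary T1 and then the add.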
Re: 700 tests for PGE
Patrick R. Michaud wrote:
> I'm getting errors from the test script itself, there are lines such as
>
>     # 65: ^ abc y $
>     p6rule_like('abc', '^', qr/0: \Q\E @/, 're_tests 35/$0 (#35)');
>     # 66: $ abc y $
>     p6rule_like('abc', '$', qr/0: \Q\E @/, 're_tests 36/$0 (#36)');
>
> which cause my test script to produce...
>
>     $ perl t/harness t/p6rules/re_tests.t
>     t/p6rules/re_tests
>     syntax error at t/p6rules/re_tests.t line 106, near 0: \Q\E @
>     syntax error at t/p6rules/re_tests.t line 108, near 0: \Q\E @
>     syntax error at t/p6rules/re_tests.t line 236, near 1: \Q\E @
>     ...
>
> I suppose the problem could be with my perl installation (5.8.0, RH9)

I don't get those errors with 5.8.3, SuSe 9.1 Personal.

If those are caused by '@/', which looks a bit like an array, then it would be easily fixed by adding a space between @ and /.

Other 'special' syntax which I invented was for tests which only test a position, not value:

    # 131: ()ef def y $-[0] 1
    # 132: ()ef def y $+[0] 3 # SKIP
    p6rule_like('def', '()ef', qr/0: .* @ 1/, 're_tests 97/$0 (#98)');
    # 133: ()ef def y $-[1] 1
    # 134: ()ef def y $+[1] 1 # SKIP
    p6rule_like('def', '()ef', qr/1: .* @ 1/, 're_tests 98/$1 (#99)');

> Also, the PGE test harness itself isn't sacred -- if there are functions or other features we could add to make this sort of testing easier, I'm all for it.

Currently the script is written in such a way that it's trivial to output tests in different syntax, should that be changed.

I was wondering if it would make sense to add the original 're_tests' file to the parrot distribution, with a script which autogenerates 're_tests.t' from it. This way it would be possible to update the script if the testing-format is changed, or if some bigger mistakes are seen.

Of course small errors in the 're_tests.t' file could be fixed manually, but if the testing-format is changed, then those changes would be lost when the file was autogenerated again.

--
Markus Laire
Jam. 1:5-6
When converting tests...
How should I convert dot, $, ...

Some examples:

  1# p5: "abc" =~ /a.c/;  (match)
  2# p5: "a\nc" =~ /a.c/; (no match)

Equivalent code for '.' would now be '\N'. Still there are tests where I could just leave the dot alone (e.g. all tests where there is no \n in target-string.)

In test-1 I could leave the dot alone to get a test which tests a similar concept (dot matching a single char). In test-2 I must change dot to \N to retain the idea of the test. Still I could just change dot to \N in both tests. But then I wouldn't get a test for a dot...

A similar problem exists for $. When there is no //m modifier, and no \n at the end, I could just keep $. When there is \n at the end (but no //m), I could change $ to be \n?$

Of course I could change $ to be \n?$ even when there is no \n in the string. But should I?

So should I convert items like dot or $ depending on the string I know the test will match the rule against (like whether it contains \n or not) - or should I convert these items always in the same way?

(Of course once perl6-rules starts working a lot better than now, we need totally new tests anyway to consider all the new possibilities.)

--
Markus Laire
Jam. 1:5-6
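
The $ case can be checked directly in Perl 5. The sketch below is only a Perl 5 demonstration: it uses \z as a stand-in for an anchor that matches only at the very end of the string (which is how this message assumes the Perl 6 $ behaves), and the sample strings are invented.

```perl
use strict;
use warnings;

my $with_nl    = "abc\n";
my $without_nl = "abc";

# Perl 5: without /m, $ matches at the very end or just before a final newline.
my $p5_dollar = ($with_nl =~ /abc$/) ? 1 : 0;

# \z matches only at the very end of the string.
my $strict_end = ($with_nl =~ /abc\z/) ? 1 : 0;

# Spelling out the optional newline gives the Perl 5 behaviour of $ again,
# whether or not the target actually ends in a newline.
my $explicit_a = ($with_nl    =~ /abc\n?\z/) ? 1 : 0;
my $explicit_b = ($without_nl =~ /abc\n?\z/) ? 1 : 0;

print "p5 \$: $p5_dollar, \\z: $strict_end, \\n?\\z: $explicit_a/$explicit_b\n";
```

Since the \n? form matches both targets, converting $ the same way everywhere keeps the original Perl 5 result; what it loses is a test of the bare end-of-string anchor.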
Re: 700 tests for PGE
On Mon, Dec 20, 2004 at 01:59:57PM +0200, Markus Laire wrote:
> I was wondering if it would make sense to add the original 're_tests' file to the parrot distribution, with a script which autogenerates 're_tests.t' from it. This way it would be possible to update the script if the testing-format is changed, or if some bigger mistakes are seen.
>
> Of course small errors in the 're_tests.t' file could be fixed manually, but if the testing-format is changed, then those changes would be lost when the file was autogenerated again.

After thinking about this a bit...

Although it's probably worthwhile for us to find a good way to encode p6rule tests in a way that allows us to go to other formats in the future, I'm not so sure we buy a lot in being able to do that for the original 're_tests'. As I'll note in my next message, the semantics of regexes and rules are just different enough that I don't think that a one-to-one correspondence for 're_tests' is worth trying to preserve or code around.

At some point I think it's just better to use the autogenerating script to create re_tests.t, manually fix re_tests.t for the things that don't convert, and then go from there.

Plus, I suspect that p5's 're_tests' is mature enough that it doesn't change often, and the places where it might change in the future are esoteric enough that it'd be easier to add+translate those changes manually than to update the conversion script.

Note that for this I'm only talking about ease of building and maintaining the p6rules test suite. Having a script to automatically convert any arbitrary p5 regex to its p6 equivalent *is* important, useful, and definitely worth pursuing, but I'm not sure it's worth the effort (at this stage) to worry about keeping the p6rules test suite consistent with that. And the p5-p6 regex converter will probably deserve its own test suite, once p6rules is sufficiently advanced.

Pm
Re: When converting tests...
On Mon, Dec 20, 2004 at 05:27:37PM +0200, Markus Laire wrote:
> How should I convert dot, $, ...
>
>   1# p5: "abc" =~ /a.c/;  (match)
>   2# p5: "a\nc" =~ /a.c/; (no match)
>
> Equivalent code for '.' would now be '\N'. Still there are tests where I could just leave the dot alone (e.g. all tests where there is no \n in target-string.)

These are cases where I think the p5 tests each need to become multiple tests in the p6rules suite. The above tests probably should go into the p6rules suite as

    "abc"   /a.c/   (match)
    "a\nc"  /a.c/   (match)
    "abc"   /a\Nc/  (match)
    "a\nc"  /a\Nc/  (no match)

so that we get all of the cases implied by #1 and #2 above.

I think this also points to why we might want to just do as much autoconverting of 're_tests' as we can at the beginning, and then decide we've done enough and maintain things manually from there.

Pm
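
The four cases above have close Perl 5 analogues, which may help when checking converted tests by hand. In this sketch /s is used to make the Perl 5 dot match a newline (as the p6 dot does in the table above) and [^\n] stands in for \N; it runs against Perl 5's own regex engine, not PGE, and the case descriptions are invented labels.

```perl
use strict;
use warnings;

# Each case: [ target string, Perl 5 pattern, description ]
my @cases = (
    [ "abc",  qr/a.c/,     'p5 dot, no newline'        ],
    [ "a\nc", qr/a.c/,     'p5 dot, newline'           ],
    [ "abc",  qr/a.c/s,    'dot with /s, no newline'   ],
    [ "a\nc", qr/a.c/s,    'dot with /s, newline'      ],
    [ "abc",  qr/a[^\n]c/, 'non-newline, no newline'   ],
    [ "a\nc", qr/a[^\n]c/, 'non-newline, newline'      ],
);

my @results = map { $_->[0] =~ $_->[1] ? 'match' : 'no match' } @cases;

printf "%-26s %s\n", $cases[$_][2], $results[$_] for 0 .. $#cases;
```

The last four rows line up with the four p6rules cases listed in the message; the first two are the original p5 tests.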
Re: Let the hacking commence!
A few initial questions/comments on some small things -- I'll get to the bigger constructs a bit later. I'm an outside-in designer, so I tend to work on the macro and micro levels until I meet in the middle.

>     rule identifier() { <alpha> \w* }

Does Perl 6 allow leading underscores in identifiers? If so, shouldn't this be

    rule identifier() { <+alpha+[_]> \w* }

?

>     rule open_expression_grouping() { \( }
>     rule close_expression_grouping() { \) }
>     rule open_argument_list() { \( }
>     rule close_argument_list() { \) }

I'm not sure I agree with expression_grouping being defined in this way -- it seems to me that parens (and brackets and braces and dots) are being treated as operators (S03, S04), perhaps even postcircumfix operators if I understand what that means (A12). So we need to be a bit careful here.

In addition to reviewing what's been done so far, I'll take a stab at writing the rules for P6 rules. :-)

Pm
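
The difference between the two identifier rules is easy to see with Perl 5 regexes that approximate them. This is only an approximation under stated assumptions: [[:alpha:]] is used in place of the alpha rule, the patterns are anchored at both ends, and the sample names are made up.

```perl
use strict;
use warnings;

# Letter first, then word characters.
my $no_underscore   = qr/\A[[:alpha:]]\w*\z/;
# Letter or underscore first, then word characters.
my $with_underscore = qr/\A[[:alpha:]_]\w*\z/;

for my $name (qw(foo _foo foo_bar 9lives)) {
    printf "%-8s %-3s %-3s\n", $name,
        ($name =~ $no_underscore   ? 'yes' : 'no'),
        ($name =~ $with_underscore ? 'yes' : 'no');
}
```

Only _foo changes between the two columns, which is the case the question is about.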
Re: MMD and VTABLE_find_method
Leopold Toetsch wrote:
> Sam Ruby wrote:
> > Leopold Toetsch wrote:
> > > The caller sets:
> > >
> > >   mmd_flag := NULL  ... no MMD, plain method lookup
> > >   mmd_flag := depth ... return the next matching method starting at the given parent search depth
> >
> > In the general case, how does the caller know that MMD is invoked?
>
> Perl6 multi subs are denoted with the multi keyword. We need some extensions to pdd03 that pass this information on to Parrot. It basically boils down to a new opcode:
>
>   call_MMD method, n
>
> as described in the threads "MMD: more implications" and "MMD dispatch".

My question was: how does the caller make such a determination?

Yes, in Perl 6, the multi subs are defined with the multi keyword. However, from http://www.perl.com/pub/a/2004/04/16/a12.html?page=10:

    Whenever you make a call using subroutine call syntax, it's a candidate for multiple dispatch.

I read this to mean that the *caller* does nothing to distinguish calls to single dispatch subroutines from calls to multiple dispatch subroutines. So... how does one determine at compile time which opcode to use?

> > > A cache invalidation function is called from add_method (and remove_method) which resets entries for the passed class.
> >
> > And, in some languages, all calls to set_attr or setprop type methods, where the value invoked may be invokable, or might obscure visibility to one that is. As calls to setting attributes/properties are frequent, my concern is that this may more than wipe out any potential benefit that such a cache may provide.
>
> You don't have operator overloading implemented in py*, do you? Anyway the code generator emits:
>
>   add Px, Py, Pz
>
> Now some attribute set operations on the class, metaclass or in the __dict__ can mean an overloading of the __add__ method of C<Py>. To handle that correctly, you can either not emit an add opcode in the first place, or you have to track the attribute set operations so that you are able to call the user-provided __add__ method.
>
> You can of course in the current scheme install an add MMD method that always does a full method lookup, but then you get the performance problem you are worrying about.

The overloading functionality has been added for a number of methods, but not yet for __add__. I've been adding methods one at a time based on the existence of test cases.

Classes like PyString are primitive, and opcodes like get_iter directly access the vtable. For classes written completely in PIR, the vtable entry for get_iter causes __iter__ methods to be invoked. Note: this decision is made at runtime, not at compile time. Instead of pessimistically assuming that all such invocations will require a method lookup, this decision is deferred to the appropriate implementation of VTABLE_get_iter.

Classes that are written in PIR but inherit from primitive classes employ a proxy, analogous to the delegate class, but in reverse. Note: proxies are only created if there is such a mix of PIR and NCI involved.

All of this is taken care of by the C<invoke> methods of PyType and PyProxyType; the compiler is unaware of these details.

> > Also, note that the Perl sub defined above is not a method.
>
> Yes. But Perl6 allows multi subs to be called as methods on the first invocant too:
>
>   $a.foo($b, $c)  :=  foo($a, $b, $c)

Given the latter syntax, how does the compiler know when to emit a callmethodcc foo and when to emit a call_MMD foo?

> > > Comments welcome,
> >
> > Counter-proposal.
> >
> > I see no reason why a full multi-dimensional multi-method dispatch PMC could not commence immediately, complete with a fully-functional polymorphic inline cache. Once it is ready and tested, we can explore setting things up so that the various mmd_dispatch_* functions exploit this functionality for the existing predefined binary operations.
>
> I don't see how this solves anything, except that you seem to be moving the burden of MMD to an additional PMC. What does this proposed MMD PMC do? How does it find the appropriate multi-method?
>
> I've described a versatile MMD scheme that is able to do n-dimensional MMD. Counter-proposals are very welcome, but the proposal has to include the mechanism by which it works. A MMD PMC that does it is too thin, sorry.

The only thing I am attempting to solve is the presumption that MMD calls can be detected at compile time. Unless you can describe a mechanism which enables callers to detect at compile time whether they are invoking a MMD subroutine or not, this code needs to be executed as a part of a VTABLE_invoke.

Again, I am not suggesting that the algorithm be made any thinner; in fact, I am not suggesting any change to the algorithms that you have described. I am merely suggesting where the logic needs to be placed.

- Sam Ruby
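
The shape of the disagreement can be modelled in a few lines of Perl. This is a toy under loud assumptions: it is not Parrot code, add_multi, find_multi, call_multi and distance are hypothetical names, and the cost function (summed inheritance distance) is just one simple choice. It shows the two ideas from the thread side by side: the caller always makes the same kind of call and the multi lookup happens at run time inside the invocation, and the lookup cache is dropped whenever a candidate is added.

```perl
use strict;
use warnings;

my (%multis, %cache);

# Register a candidate and drop cached lookups for that name,
# mirroring the "invalidate on add_method" idea from the thread.
sub add_multi {
    my ($name, $sig, $code) = @_;
    push @{ $multis{$name} }, { sig => $sig, code => $code };
    delete $cache{$name};
}

# Number of inheritance steps from $class up to $target, or undef if unrelated.
sub distance {
    my ($class, $target) = @_;
    return 0 if $class eq $target;
    no strict 'refs';
    my @dists = grep { defined } map { distance($_, $target) } @{"${class}::ISA"};
    return unless @dists;
    my ($min) = sort { $a <=> $b } @dists;
    return $min + 1;
}

# Pick the candidate with the smallest summed distance over all arguments.
sub find_multi {
    my ($name, @args) = @_;
    my $key = join ',', map { ref $_ } @args;
    return $cache{$name}{$key} if exists $cache{$name}{$key};
    my ($best, $best_cost);
    CANDIDATE: for my $cand (@{ $multis{$name} || [] }) {
        next unless @{ $cand->{sig} } == @args;
        my $cost = 0;
        for my $i (0 .. $#args) {
            my $d = distance(ref($args[$i]), $cand->{sig}[$i]);
            next CANDIDATE unless defined $d;
            $cost += $d;
        }
        ($best, $best_cost) = ($cand->{code}, $cost)
            if !defined $best_cost || $cost < $best_cost;
    }
    return $cache{$name}{$key} = $best;
}

# The caller does not know or care whether $name has one candidate or many.
sub call_multi {
    my ($name, @args) = @_;
    my $code = find_multi($name, @args) or die "no candidate for $name";
    return $code->(@args);
}

{ package Num; }
{ package Int; our @ISA = ('Num'); }

add_multi(add => [qw(Num Num)], sub { 'Num+Num' });
my $before = call_multi(add => bless({}, 'Int'), bless({}, 'Int'));
add_multi(add => [qw(Int Int)], sub { 'Int+Int' });   # invalidates the cached entry
my $after  = call_multi(add => bless({}, 'Int'), bless({}, 'Int'));

print "$before, then $after\n";
```

The second call picks up the more specific candidate only because add_multi cleared the cache; how often such invalidation happens in practice is exactly the performance concern raised above.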
[perl #33129] N registers get whacked in odd circumstances
# New Ticket Created by Dan Sugalski
# Please include the string: [perl #33129]
# in the subject line of all future correspondence about this issue.
# URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=33129

I'm finding that the N registers are getting messed up with some function calls, though I can't pin down exactly which ones yet. (Working on that) However, here's a trace of one of the cases where they do get messed up:

    167 set N5, N15 - N5=411.40, N15=22253.00
    170 set I0, 1 - I0=1,
    173 set I1, 0 - I1=0,
    176 set I2, 0 - I2=0,
    179 set I3, 0 - I3=0,
    182 set I4, 1 - I4=0,
    185 returncc
    # Back in sub '_MAIN'
    *** switching to BYTECODE_reports/sorep.imc
    # Calling meth '__set_number_native'
    # in file '(unknown file)' near line -1
    # Calling sub '__set_number_native'
    # in file '(unknown file)' near line -1
    *** switching to BYTECODE_classes/Money8.imc
    303 set N30, N5 - N30=219.00, N5=-21841.60
    306 interpinfo P30, 16 - P30=PMCNULL,
    309 classoffset I30, P30, Money8 - I30=6, P30=Object(Money8)=PMC(0x413da718),
    313 getattribute P15, P30, I30 - P15=PMCNULL, P30=Object(Money8)=PMC(0x413da718), I30=6
    317 mul N30, N30, 100 - N30=-21841.60, N30=-21841.60,
    321 floor N30, N30 - N30=-2184160.00, N30=-2184160.00
    324 set P15, N30 - P15=Integer=PMC(0x413d99f8), N30=-2184160.00
    327 set I0, 1 - I0=1,
    330 set I1, 0 - I1=0,

You'll note that N5 is set to 22253 when the returncc's done, but after the return the value is -21841.6. Looks like something's stomping the N registers.

--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Perl 6 Summary for 2004-12-06 through 2004-12-20
All~

The observant among you might notice that I missed last week's summary. With the hubbub and confusion of the holidays, I blame ninjas, in particular Ryu Hyabusa. Given that Christmas is next weekend and New Years is the weekend after that, what you are likely to see in the future are a pair of 10 day summaries or some other equally irregular pattern. If you are thinking of using the dates of my summaries to seed a random number generator, I would advise against it as I can be really easily bought ;-)

Without more ado, I give you this fortnight's summary starting with

Perl 6 Language

Lexical scope of parametric declaration blocks
Ashley Winters wanted to know what the differences were between type parameter lists and sub parameter lists. Luke Palmer could not think of any.
http://xrl.us/ef7f

object representation
Abhijit Mahabal noticed that S12 allowed one to supply an object layout to bless() and wondered if one could really have two instances of the same class with different layouts. Larry admitted that he had probably not intended for that to be the case.
http://xrl.us/ef7g

capturing into a hash, hypothetically
Patrick R. Michaud wondered about capturing things into a hash in S05, as <ident> now captures. Larry admitted that it was probably supposed to be («ident»), but also noticed that this exposed a blind spot in the design. He went on to ruminate about this blind spot and ways to solve it. Much churning went on and it seems that multiple different (but identically named) rule captures can now be performed by adding information after a dash ala <ws-1> <ws-2> <ws-3>.
http://xrl.us/ef7h

custom subscripting
When talking about key Type for a hash, Larry offhandedly commented about attaching a block to a hash or array to provide custom subscripting. Many people drooled over the awesome syntactic sugar this could provide them.
http://xrl.us/ef7i

undeclared attributes
Dave Whipp hoped that he need not predeclare his attributes: as they necessarily start with $., the fact that a new variable is an attribute is easy to determine. Abhijit Mahabal thought that it would not be a good idea, but then asked if classes could be declared as not strict. Still waiting for more official word...
http://xrl.us/ef7j

classes which autovivify attributes
Abhijit Mahabal wondered about creating a class that populates its attributes on demand, as some of them might be rarely used. Larry suggested that it would be something that one should not undertake lightly and a simple hash attribute would provide most of what is wanted. This also morphed into the eternal debate about strictures and one liners. There has to be a joke in there somewhere: "A stricture, a one-liner, and Larry Wall walk into a bar..."
http://xrl.us/ef7k

auto my
Rod Adams wondered if having my occur automatically for new variables might be worthwhile. Several people commented that some languages already do this and it is simply an aesthetic choice. The consensus seems to be that Perl has already made this choice and is sticking with its answer.
http://xrl.us/ef7m

Perl 6 Compiler

At long last google has picked up P6C. I guess I have slightly mixed emotions about this as it takes a running gag from me. Alas, I will have to find another.

PGE tests
Markus Laire began working on a formerly small script to convert perl 5's regex tests to PGE. He produced a modest 700 tests, a few of which pass. Nice work. Patrick suggested only running the script once and thereafter maintaining the tests external to perl5.
http://xrl.us/ef7n -- initial post
http://xrl.us/ef7o -- Patrick's suggestion
http://xrl.us/ef7p -- how to deal with ambiguity converting

On your marks, get set, HACK!
Luke Palmer opened the door to hacking and has requested rules for parts of the Perl 6 Grammar. Patrick posted a link to the SVN repository for it.
http://xrl.us/ef7q
https://svn.perl.org/perl6

Parrot

\0namespace
Leo committed a fix to support namespace mangling.
http://xrl.us/ef7r

store global => invalidate method cache
Leo committed a fix to invalidate the method cache when a global is stored.
http://xrl.us/ef7s

pow, hash, batman sound effect!
Leo added pow and hash as vtables and opcodes. He also renamed new_extended to instantiate.
http://xrl.us/ef7t

base scalar semantics
Leo asked for comments about base PMC semantics and received none.
http://xrl.us/ef7u

split now independent of Perl
James deBoer provided a patch removing the dependency on Perl Array in split. Will applied it.
http://xrl.us/ef7v

SVN
Periodically every project has a thread about
Test::Builder versus Unicode
Hi all,

The following code:

    use utf8;
    use diagnostics;
    BEGIN {binmode STDOUT, ':utf8';}
    use Test::More tests => 1;

    # those are smart quotes
    diag "This is a \x{201c}test\x{201d}";
    ok 1;

Produces the following error message:

    1..1
    Wide character in print at /usr/local/lib/perl5/5.8.5/Test/Builder.pm line 1005 (#1)
        (W utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8'. Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and perlfunc/binmode.

    # This is a test
    ok 1

And looking at line 1005:

    sub _print_diag {
        my $self = shift;

        local($\, $", $,) = (undef, ' ', '');
        my $fh = $self->todo ? $self->todo_output : $self->failure_output;
        print $fh @_; # here there be smart quotes
    }

There are a few strange paths in the code which could be causing this (I'm wondering about the autoflush), but I was wondering if anyone has seen this and knows how to cope with it? As you can see, I've tried the standard binmode ':utf8' and using utf8, but to no avail.

Cheers,
Ovid

=====
Silence is Evil http://users.easystreet.com/ovid/philosophy/decency.html
Ovid http://www.perlmonks.org/index.pl?node_id=17000
Web Programming with Perl http://users.easystreet.com/ovid/cgi_course/
Help redo the Phalanx 100
=head1 Announcing the Do-It-Yourself Phalanx 100!

The Phalanx 100 is a list of the top 100 modules on CPAN, and by extension, those that should have the most attention paid to them by the Phalanx project.

The first time I generated the P100 was over a year ago, and things are old and stale. Distributions have changed names (CGI::Kwiki is now Kwiki, for example). Some distros have come and some have gone. It's time to be updated.

This time, YOU can help determine the P100. The source data, generated from logs from the main CPAN mirror at pair.com, is available for download at L<http://petdance.com/random/cpan-gets.gz>. Write code that analyzes the data, and generates the top 100 modules.

What should your code do? It's up to you! Publish the code somewhere (use.perl.org, perlmonks, whatever) and let me see it. I'm not sure if I'll take someone's decisions directly, or use ideas, or how I'll do it, but the more working code I have to pick from, the better.

Also, the last time I created a P100, I omitted any modules that were in the core distribution. This time, I do want to include core modules, although I do want to have them noted somehow. Richard Clamp's C<Module::CoreList> will be a great help with this.

Whatever you do, however you do it, I need to know about your code no later than January 10th, 2005. Email me at C<andy at petdance.com>. There's going to be an article about the Phalanx project going up on perl.com soon after that, and I need to have an updated version of the P100 up (replacing L<http://qa.perl.org/phalanx/distros.html>) by then.

=head2 About the data

I used the following code to analyze data from the Apache logs for the main CPAN mirror at Pair.com from November 1 to December 15th, 2004.
    #!/usr/bin/perl

    use strict;
    use warnings;

    my %id;
    my $next_id = 1;

    while (<>) {
        next unless m!^\S+ (\S+) .+ "GET ([^"]+) HTTP/\d\.\d" 200!;
        my ($ip,$path) = ($1,$2);

        study $path;

        # Skip directories
        next if $path =~ /\/$/;     # Directory
        next if $path =~ /\/\?/;    # Directory with sort parms

        # Skip certain directories
        next if $path =~ /^\/(icons|misc|ports|src)\//;

        # Skip certain file extensions
        next if $path =~ /\.(rss|html|meta|readme)$/;

        # Skip CPAN distro maintenance stuff
        next if $path =~ /CHECKSUMS$/;
        next if $path =~ /MIRRORING/;

        # Module list stuff
        next if $path =~ /\Q00whois./;
        next if $path =~ /\Q01mailrc./;
        next if $path =~ /\Q02packages.details/;
        next if $path =~ /\Q03modlist./;

        my $id = ($id{$ip} ||= ++$next_id);

        print "$id $path\n";
    }

This gives lines like this:

    16395 /authors/id/K/KE/KESTER/WWW-Yahoo-DrivingDirections-0.07.tar.gz
    10001 /authors/id/K/KW/KWOOLERY/Buzznet-API-0.01.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.01.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.02.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.03.tar.gz

The 5-digit number is an ID number for a given IP address.

I found that some IPs were routinely slurping down entire histories of modules, which probably will skew statistics to those with a lot of revisions.

How should these be accounted for in the analysis? I don't know. That's one of the reasons that I put this out for all to work on.

I welcome your comments, suggestions and help on this.

xoxo,
Andy

--
Andy Lester => [EMAIL PROTECTED] => www.petdance.com => AIM:petdance
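
As one possible starting point (a sketch, not the official method), the following counts how many distinct client IDs fetched each distribution, so that one ID downloading every revision of a module counts only once. The dist_name and tally helpers, the list of archive extensions and the version-stripping regex are all assumptions made for this example; real CPAN file names are messier than this handles.

```perl
use strict;
use warnings;

# Turn a download path into a distribution name by dropping the
# directory, the archive extension and the trailing version number.
sub dist_name {
    my ($path) = @_;
    my ($file) = $path =~ m{([^/]+)$} or return;
    $file =~ s/\.(?:tar\.gz|tgz|zip)$// or return;
    $file =~ s/-v?\d[\d._]*$//;
    return $file;
}

# Count each client ID at most once per distribution.
sub tally {
    my (@lines) = @_;
    my (%seen, %count);
    for my $line (@lines) {
        my ($id, $path) = split ' ', $line, 2;
        next unless defined $path;
        chomp $path;
        my $dist = dist_name($path) or next;
        $count{$dist}++ unless $seen{$dist}{$id}++;
    }
    return \%count;
}

# The sample lines from the announcement stand in for the real data file.
my $count = tally(
    "16395 /authors/id/K/KE/KESTER/WWW-Yahoo-DrivingDirections-0.07.tar.gz\n",
    "10001 /authors/id/K/KW/KWOOLERY/Buzznet-API-0.01.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.01.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.02.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.03.tar.gz\n",
);

# Highest count first, ties broken by name; keep at most 100 entries.
my @top = sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;
splice @top, 100 if @top > 100;
printf "%5d %s\n", $count->{$_}, $_ for @top;
```

With the sample lines, Net-Telnet is counted once even though the same ID fetched three revisions. Whether that is the right way to weight such clients is the open question posed above.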
Help update the Phalanx 100
=head1 Announcing the Do-It-Yourself Phalanx 100! The Phalanx 100 is a list of the top 100 modules on CPAN, and by extension, those that should have the most attention paid to them by the Phalanx project. The first time I generated the P100 was over a year ago, and things are old and stale. Distributions have changed names (CGI::Kwiki is now Kwiki, for example). Some distros have come and some have gone. It's time to be updated. This time, YOU can help determine the P100. The source data, generated from logs from the main CPAN mirror at pair.com, is available for download at Lhttp://petdance.com/random/cpan-gets.gz. Write code that analyzes the data, and generates the top 100 modules. What should your code do? It's up to you! Publish the code somewhere (use.perl.org, perlmonks, whatever) and let me see it. I'm not sure if I'll take someone's decisions directly, or use ideas, or how I'll do it, but the more working code I have to pick from, the better. Also, the last time I created a P100, I omitted any modules that were in the core distribution. This time, I do want to include core modules, although I do want to have them noted somehow. Richard Clamp's CModule::CoreList will be a great help with this. Whatever you do, however you do it, I need to know about your code no later than January 10th, 2005. Email me at Candy at petdance.com. There's going to be an article about the Phalanx project going up on perl.com soon after that, and I need to have an updated version of the P100 up (replacing Lhttp://qa.perl.org/phalanx/distros.html) by then. =head2 About the data I used the following code to analyze data from the Apache logs for the main CPAN mirror at Pair.com from November 1 to December 15th, 2004. 
    #!/usr/bin/perl

    use strict;
    use warnings;

    my %id;
    my $next_id = 1;

    while (<>) {
        next unless m!^\S+ (\S+) .+ "GET ([^"]+) HTTP/\d\.\d" 200!;
        my ($ip,$path) = ($1,$2);
        study $path;

        # Skip directories
        next if $path =~ /\/$/;  # Directory
        next if $path =~ /\/\?/; # Directory with sort parms

        # Skip certain directories
        next if $path =~ /^\/(icons|misc|ports|src)\//;

        # Skip certain file extensions
        next if $path =~ /\.(rss|html|meta|readme)$/;

        # Skip CPAN distro maintenance stuff
        next if $path =~ /CHECKSUMS$/;
        next if $path =~ /MIRRORING/;

        # Module list stuff
        next if $path =~ /\Q00whois./;
        next if $path =~ /\Q01mailrc./;
        next if $path =~ /\Q02packages.details/;
        next if $path =~ /\Q03modlist./;

        my $id = ($id{$ip} ||= ++$next_id);

        print "$id $path\n";
    }

This gives lines like this:

    16395 /authors/id/K/KE/KESTER/WWW-Yahoo-DrivingDirections-0.07.tar.gz
    10001 /authors/id/K/KW/KWOOLERY/Buzznet-API-0.01.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.01.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.02.tar.gz
    85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.03.tar.gz

The 5-digit number is an ID number for a given IP address.

I found that some IPs were routinely slurping down entire histories of modules, which probably will skew statistics to those with a lot of revisions. How should these be accounted for in the analysis? I don't know. That's one of the reasons that I put this out for all to work on.

I welcome your comments, suggestions and help on this.

Thanks,
xoxo,
Andy

--
Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance
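One possible starting point for the analysis, sketched under assumptions that are not Andy's: each ID is counted at most once per distribution, so an IP that pulls every release of Net-Telnet contributes one vote rather than three. The helper names dist_name and rank_dists, the version-stripping regex, and the list of archive extensions are hypothetical choices made for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce a path to a distribution name by dropping
# the directories, the archive extension, and a simple trailing version.
sub dist_name {
    my ($path) = @_;
    my ($file) = $path =~ m{([^/]+)$} or return;
    $file =~ s/\.(?:tar\.gz|tgz|zip)$// or return;   # assumed extension list
    $file =~ s/-v?\d[\d._]*$//;                      # assumed version format
    return $file;
}

# Count each requester ID at most once per distribution.
sub rank_dists {
    my (@lines) = @_;
    my %seen;     # "$id $dist" pairs already counted
    my %count;    # distribution => number of distinct IDs
    for my $line (@lines) {
        my ($id, $path) = split ' ', $line, 2;
        next unless defined $path;
        chomp $path;
        my $dist = dist_name($path);
        next unless defined $dist;
        $count{$dist}++ unless $seen{"$id $dist"}++;
    }
    return \%count;
}

# The five sample lines shown above.
my $count = rank_dists(
    "16395 /authors/id/K/KE/KESTER/WWW-Yahoo-DrivingDirections-0.07.tar.gz\n",
    "10001 /authors/id/K/KW/KWOOLERY/Buzznet-API-0.01.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.01.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.02.tar.gz\n",
    "85576 /authors/id/J/JR/JROGERS/Net-Telnet-3.03.tar.gz\n",
);

# Most-requested first, ties broken by name.
for my $dist (sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count) {
    print "$count->{$dist} $dist\n";
}
```

With the five sample lines, each of the three distributions ends up with a count of 1, because the three Net-Telnet downloads share one ID. To run it over the real file, the list passed to rank_dists could be replaced with the lines read from the decompressed cpan-gets data.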
Re: Test::Builder versus Unicode
On Mon, Dec 20, 2004 at 04:50:57PM -0800, Ovid wrote:
> And looking at line 1005:
>
>     sub _print_diag {
>         my $self = shift;
>
>         local($\, $", $,) = (undef, ' ', '');
>         my $fh = $self->todo ? $self->todo_output : $self->failure_output;
>         print $fh @_; # here there be smart quotes
>     }
>
> There are a few strange paths in the code which could be causing this (I'm wondering about the autoflush), but I was wondering if anyone has seen this and knows how to cope with it? As you can see, I've tried that standard binmode ':utf8' and using utf8, but to no avail.

For one, diag() goes to STDERR. But binmode'ing that doesn't work either. It must not survive the filehandle dup Test::Builder does.

This shuts it up.

    use Test::Builder;
    BEGIN {
        my $fh = Test::Builder->new->failure_output;
        binmode $fh, ':utf8';
    }

Test::Builder should do something like this internally; it's not like anyone's going to drive binary data through a TB filehandle. The question is how does one do it without breaking older perls?

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
Once is a prank. Twice is a nuisance. But NINE TIMES is a TRADITION.
    -- Mark-Jason Dominus in [EMAIL PROTECTED]
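The workaround above, expanded into a complete test file as a minimal sketch. It assumes the same idea should apply to all three Test::Builder output handles (output, failure_output and todo_output), not only the one diag() uses; the test name and the sample string are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 1;

BEGIN {
    # Test::More->builder returns the shared Test::Builder object.
    my $builder = Test::More->builder;

    # Treat every Test::Builder handle as text, not only the diag() one.
    binmode $builder->output,         ':utf8';
    binmode $builder->failure_output, ':utf8';
    binmode $builder->todo_output,    ':utf8';
}

# Curly quotes written as escapes so this file itself stays ASCII.
diag "\x{201c}smart quotes\x{201d}";
ok 1, 'diag() accepted wide characters';
```

Setting the layer on the builder's own handles, rather than on STDERR, sidesteps the question of whether a layer survives the dup.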
Re: Test::Builder versus Unicode
On Mon, Dec 20, 2004 at 06:13:54PM -0800, David Wheeler wrote:
> > Test::Builder should do something like this internally; it's not like anyone's going to drive binary data through a TB filehandle. The question is how does one do it without breaking older perls?
>
> If there was a way to tell what mode was on STDERR before you duped it, you could just set it to the same. Something like:
>
>     my $mode = what_binmode(STDERR);
>     my $fh = $builder->failure_output;
>     binmode $fh, $mode;
>
> Is there a module or function in Perl that can provide this information?

Why does it matter what it was set to before? I'm always going to be shoving text out through this filehandle.

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
And God was pleased. And Dog was happy and wagged his tail. And Adam was greatly improved. And Cat did not care one way or the other.
    -- http://www.catsarefrommars.com/creationist.htm
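On the quoted question of whether Perl can report a handle's mode: PerlIO::get_layers, described in the PerlIO manual page, returns the names of the layers on a handle. Below is a minimal sketch of how the imagined what_binmode could be built on it. The function name text_layers and the decision to copy only encoding-related layers are assumptions made for illustration, and the sketch needs a perl built with PerlIO.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the what_binmode() imagined above: report the
# encoding-related layers on a handle as a string that binmode() accepts.
sub text_layers {
    my ($fh) = @_;
    my @layers = PerlIO::get_layers($fh);    # e.g. ("unix", "perlio", "utf8")

    # An explicit encoding(...) layer takes precedence over the bare flag.
    my @encodings = grep { /^encoding\(/ } @layers;
    return join '', map { ":$_" } @encodings if @encodings;

    my $has_utf8 = grep { $_ eq 'utf8' } @layers;
    return $has_utf8 ? ':utf8' : '';
}

binmode STDERR, ':utf8';

# Duplicate STDERR, then give the copy the same text layers.
open my $dup, '>&', \*STDERR or die "Cannot dup STDERR: $!";
my $mode = text_layers(\*STDERR);
binmode $dup, $mode if length $mode;

print {$dup} "\x{201c}copied layers: $mode\x{201d}\n";
```

Copying whatever the original handle carried, instead of hard-coding one layer, leaves the choice of encoding with whoever configured STDERR.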
Re: Test::Builder versus Unicode
On Dec 20, 2004, at 6:13 PM, David Wheeler wrote:
> If there was a way to tell what mode was on STDERR before you duped it, you could just set it to the same. Something like:
>
>     my $mode = what_binmode(STDERR);
>     my $fh = $builder->failure_output;
>     binmode $fh, $mode;
>
> Is there a module or function in Perl that can provide this information?

If not, another option is to add a binmode option to Test::Builder (and the modules that depend on it). So you could do something like this:

    use Test::More tests => 6, binmode => ':utf8';

Thoughts?

Regards,

David
Re: Test::Builder versus Unicode
On Mon, 2004-12-20 at 18:20 -0800, David Wheeler wrote:
> If not, another option is to add a binmode option to Test::Builder (and the modules that depend on it). So you could do something like this:
>
>     use Test::More tests => 6, binmode => ':utf8';
>
> Thoughts?

I'd rather override Test::Builder::Output.

Schwern, how's that refactoring we planned two years and a few months ago coming? Oh right. Yeah, me too. Sorry,

-- c
Re: Test::Builder versus Unicode
On Mon, Dec 20, 2004 at 06:20:41PM -0800, David Wheeler wrote:
> If not, another option is to add a binmode option to Test::Builder (and the modules that depend on it). So you could do something like this:
>
>     use Test::More tests => 6, binmode => ':utf8';
>
> Thoughts?

Again, this is not something the user should have to care about. Only text is shoved through those filehandles so setting them to handle Unicode should always be the right thing to do, unless it breaks an old perl.

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
I hate war as only a soldier who has lived it can, only as one who has seen its brutality, its stupidity.
    -- Dwight D. Eisenhower
Re: Test::Builder versus Unicode
On Dec 20, 2004, at 6:19 PM, Michael G Schwern wrote:
> > Is there a module or function in Perl that can provide this information?
>
> Why does it matter what it was set to before? I'm always going to be shoving text out through this filehandle.

It matters because if I'm using Big5 in my module, I *don't* want binmode set to :utf8, which is Perl's internal representation of UTF-8. I would want it set to :big5.

> Again, this is not something the user should have to care about. Only text is shoved through those filehandles so setting them to handle Unicode should always be the right thing to do, unless it breaks an old perl.

Well, if that's the case, then the smarter thing might be to encode utf8 strings in Test::Builder before outputting them. You'd have to do something like this:

    print $fh map { $_ = Encode::encode_utf8($_) if Encode::is_utf8($_); $_ } @_;

This should prevent the warning from happening.

Regards,

David
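A variant of the one-liner above, sketched as a standalone helper. The name encode_for_output is hypothetical, and nothing here is Test::Builder code. The one deliberate difference is that it returns encoded copies instead of assigning to $_, since $_ inside map aliases the caller's arguments and assigning to it would change them (or fail on a read-only value).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: return UTF-8 encoded copies of character strings,
# leaving byte strings and the caller's own variables untouched.
sub encode_for_output {
    return map { Encode::is_utf8($_) ? Encode::encode_utf8($_) : $_ } @_;
}

# Curly quotes written as escapes so this file itself stays ASCII.
my @message = ("\x{201c}smart quotes\x{201d}", " and plain ASCII\n");

# The handle receives bytes only, so no layer needs to be set on it.
print STDERR encode_for_output(@message);
```

Because the strings are encoded before they reach the handle, this approach does not depend on which layers the handle has, at the cost of always emitting UTF-8.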
Re: Test::Builder versus Unicode
My Official Policy on this is now to let people who actually understand character encodings work it out and just wait for a patch.

PS Somebody should drag autrijus into this.

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
We don't know. But if we did, we wouldn't tell you.
Test::Legacy warnock'd
I've gotten absolutely no response about Test::Legacy. Is anybody using it? Anybody tried migrating old Test.pm based tests with it?

--
Michael G Schwern [EMAIL PROTECTED] http://www.pobox.com/~schwern/
I'm crazy but I get the job done.