hi,
here is one simple script (Requires Parse::RecDescent) that count operators
in scripts.(and my fisrt grammar ;") )
OK. I started this against my current perl installation.
(it is not pure RH6.2 install, but many things are added)
i.e.

find /perl_dir -name *.pm | ./count.pl | tee allops.txt

it is little bit slower so try first on some PERL subdir not ROOT-Perl dir.
 # I can't figure out why \w+ OR \w+?\b didn't work so I use \S+
 # this is a bit slower 'cause action is executed on every
 # chunk text instead only on words
Add other delimers if used i.e. other than [,{,(,!,#,| for "q and re" stuff.

for THE results see at the END of the mail.

What is interesting to me :

1. "push" is used more than any of the other array ops, even than "shift"
2. "use" is very good candidate for speedup
3. We still use very much "goto" :")
4. "each" is used more than "values" and "keys"
5. Things like "hex,chr,oct,atan2" are used very rarely
6. "pack" and "unpack" are also used very rarely, "study" -
    the same number of times.

We can make similar thing for the whole CPAN.
What will this give to us :
1. It will help us to decide which of the operators are mostly used
   (CPAN is suitable for this) so then we can take care
   to speed up only mostly used ops in the new Perl6 (or Perl5)
   (current script doesn't care about the "weight" of the ops i.e.
    it doesn't count how many times any op will be used in REAL LIFE f.e
    some op may execute 10 times during the life of the module but other can
be
    executed only once. They are both counted as "ONE time" execution)
    Also how many times the module is dloaded from CPAN, has some meaning,
for
    making better calculation.
2. Will give us some better idea which of all outofcore-candidates can be
    easly purged from the CORE
3. Can focus our attention on which ops will be more problematic for
retaining
   compatibility.
   One very good idea for the Perl5->Perl6 TRANSLATION script to be used
module
   such as Parse::RecDescent.

The script count also the content of POD comments, which is bad.



#!/usr/bin/perl

use strict;
use Parse::RecDescent;
use vars qw{ @ops %ops $text $grammar };


sub loadfile($)
{
 open FILE, $_[0]  or return "-->";
 my $contents = <FILE>;
 close FILE;
 return $contents;
};

#============== GRAMMAR ===========================#
my $grammar = q{

start : op(s)
op    : stuff | /\S+/


         $::ops{$item[1]}++ if defined $::ops{$item[1]};
#      print "=$item[1]|\n"
        }

stuff  : qstuff | restuff

qstuff : m*q[qwxr]?[\[\{\(\|!/#]*
    { $::ops{qstuff}++   }
restuff : m*([ysm]|tr)[\[\{\(\|!/#]*
    { $::ops{restuff}++ }

};
#======================OPS=========================#
 #from perlfunc
@ops =

chomp chop chr crypt hex index lc lcfirst
length oct ord pack reverse
rindex sprintf substr  uc ucfirst
pos quotemeta  split study
abs atan2 cos exp hex int log oct rand
sin sqrt srand
pop push shift splice unshift
grep join map reverse sort unpack
delete each exists keys values
binmode close closedir dbmclose dbmopen die eof
fileno flock format getc print printf read
readdir rewinddir seek seekdir select syscall
sysread sysseek syswrite tell telldir truncate
warn write
pack read syscall sysread syswrite unpack vec
chdir chmod chown chroot fcntl glob
ioctl link lstat mkdir open opendir
readlink rename rmdir stat symlink umask
unlink utime
caller continue die do dump eval exit
goto last next redo return sub wantarray
caller import local my package use
defined dump eval formline local my reset
scalar undef wantarray
alarm exec fork getpgrp getppid getpriority kill
pipe setpgrp setpriority sleep system
times wait waitpid
do import no package require use
bless dbmclose dbmopen package ref tie tied
untie use
accept bind connect getpeername getsockname
getsockopt listen recv send setsockopt shutdown
socket socketpair
msgctl msgget msgrcv msgsnd semctl semget semop
shmctl shmget shmread shmwrite
endgrent endhostent endnetent endpwent getgrent
getgrgid getgrnam getlogin getpwent getpwnam
getpwuid setgrent setpwent
endprotoent endservent gethostbyaddr gethostbyname
gethostent getnetbyaddr getnetbyname getnetent
getprotobyname getprotobynumber getprotoent
getservbyname getservbyport getservent sethostent
setnetent setprotoent setservent
gmtime localtime time times
abs bless chomp chr exists formline glob
import lc lcfirst map my no prototype qx
qw readline readpipe ref sub sysopen tie
tied uc ucfirst untie use
dbmclose dbmopen
binmode chmod chown chroot crypt
dbmclose dbmopen dump endgrent endhostent
endnetent endprotoent endpwent endservent exec
fcntl flock fork getgrent getgrgid gethostent
getlogin getnetbyaddr getnetbyname getnetent
getppid getprgp getpriority getprotobynumber
getprotoent getpwent getpwnam getpwuid
getservbyport getservent getsockopt glob ioctl
kill link lstat msgctl msgget msgrcv
msgsnd open pipe readlink rename select semctl
semget semop setgrent sethostent setnetent
setpgrp setpriority setprotoent setpwent
setservent setsockopt shmctl shmget shmread
shmwrite socket socketpair stat symlink syscall
sysopen system times truncate umask unlink
utime wait waitpid
};

#======================MAIN=========================#
 #make ops HASH and remove repeated ops
map { $ops{$_} = 0 } @ops;
my $parser = new Parse::RecDescent ($grammar);
undef $/; $| = 1;
my $time1 = time;
 #parse all files passed
while (<>)
{
 foreach my $f (split /\n/)
  {
   print "File : $f.........\n";
   $text = loadfile($f);
   # study($text);
   # print $text;
   $parser->start($text);
  };
};
my $time2 = time;
 #print the result
my $i;
    #use other number if U want in comparison
map { $i++; print qq{$i. '$_' count :  $ops{$_}\n} if $ops{$_} > 0 }
    sort { $ops{$b} <=> $ops{$a} } keys %ops;
print "Time 2 : $time2\n";
print "Time 1 : $time1\n";
#======================END MAIN=========================#
__END__




=====RESULT=========================

'my' count :  5512
'sub' count :  4355
'return' count :  2502
'use' count :  2024
'qstuff' count :  1174
'print' count :  1122
'restuff' count :  949
'package' count :  690
'defined' count :  615
'push' count :  577
'warn' count :  423
'no' count :  392
'require' count :  389
'die' count :  363
'do' count :  329
'next' count :  309
'each' count :  230
'eval' count :  222
'values' count :  196
'local' count :  190
'last' count :  189
'shift' count :  189
'import' count :  171
'bless' count :  167
'open' count :  163
'keys' count :  155
'undef' count :  146
'time' count :  140
'read' count :  129
'ref' count :  127
'printf' count :  123
'close' count :  119
'delete' count :  119
'scalar' count :  118
'join' count :  117
'exists' count :  107
'system' count :  105
'split' count :  96
'write' count :  96
'map' count :  95
'goto' count :  89
'tie' count :  74
'format' count :  72
'socket' count :  68
'sort' count :  63
'reset' count :  62
'tied' count :  57
'unshift' count :  54
'length' count :  51
'pos' count :  49
'send' count :  49
'chdir' count :  46
'grep' count :  45
'times' count :  42
'index' count :  42
'link' count :  41
'unlink' count :  40
'wantarray' count :  39
'pop' count :  37
'tell' count :  34
'rename' count :  33
'glob' count :  31
'untie' count :  26
'accept' count :  26
'log' count :  23
'int' count :  23
'chmod' count :  22
'select' count :  22
'stat' count :  22
'exit' count :  21
'wait' count :  19
'sprintf' count :  19
'dump' count :  19
'sleep' count :  19
'lc' count :  19
'prototype' count :  17
'kill' count :  17
'binmode' count :  17
'continue' count :  17
'caller' count :  16
'pipe' count :  15
'substr' count :  15
'connect' count :  15
'mkdir' count :  14
'ord' count :  14
'fileno' count :  14
'exec' count :  13
'chop' count :  12
'truncate' count :  12
'qw' count :  12
'sqrt' count :  11
'umask' count :  11
'chomp' count :  11
'abs' count :  11
'reverse' count :  11
'splice' count :  10
'localtime' count :  10
'getc' count :  10
'sin' count :  10
'listen' count :  10
'alarm' count :  9
'uc' count :  9
'eof' count :  9
'waitpid' count :  9
'bind' count :  9
'fork' count :  8
'closedir' count :  8
'opendir' count :  8
'quotemeta' count :  8
'rmdir' count :  8
'unpack' count :  7
'exp' count :  7
'readdir' count :  7
'cos' count :  7
'gmtime' count :  7
'redo' count :  6
'seek' count :  6
'atan2' count :  6
'hex' count :  6
'rand' count :  6
'recv' count :  5
'syscall' count :  5
'semctl' count :  5
'utime' count :  5
'getpwuid' count :  5
'getpgrp' count :  5
'sysread' count :  5
'pack' count :  5
'readline' count :  5
'srand' count :  5
'chown' count :  5
'fcntl' count :  5
'lstat' count :  5
'getgrgid' count :  5
'getgrnam' count :  5
'ioctl' count :  5
'syswrite' count :  5
'rewinddir' count :  5
'study' count :  5
'getpwnam' count :  5
'getlogin' count :  4
'getppid' count :  4
'getprotobyname' count :  3
'seekdir' count :  3
'readlink' count :  3
'getgrent' count :  3
'flock' count :  3
'telldir' count :  3
'setsockopt' count :  3
'msgrcv' count :  3
'crypt' count :  3
'getpwent' count :  3
'msgget' count :  3
'semget' count :  3
'setpgrp' count :  3
'formline' count :  3
'getnetbyname' count :  3
'getservbyport' count :  3
'getsockopt' count :  3
'semop' count :  3
'getnetbyaddr' count :  3
'shmread' count :  3
'socketpair' count :  3
'shutdown' count :  3
'getservbyname' count :  3
'gethostbyname' count :  3
'shmget' count :  3
'dbmclose' count :  3
'shmctl' count :  3
'chroot' count :  3
'getprotobynumber' count :  3
'gethostbyaddr' count :  3
'msgctl' count :  3
'msgsnd' count :  3
'getservent' count :  3
'dbmopen' count :  3
'shmwrite' count :  3
'symlink' count :  3
'getpeername' count :  2
'ucfirst' count :  2
'sysopen' count :  2
'sysseek' count :  2
'oct' count :  2
'lcfirst' count :  2
'rindex' count :  2
'getpriority' count :  2
'vec' count :  2
'getprotoent' count :  2
'getsockname' count :  2
'chr' count :  2
'setpriority' count :  2
'endprotoent' count :  1
'gethostent' count :  1
'endservent' count :  1
'endpwent' count :  1
'endhostent' count :  1
'endnetent' count :  1
'setservent' count :  1
'endgrent' count :  1
'getnetent' count :  1
'sethostent' count :  1
'setgrent' count :  1
'setprotoent' count :  1
'setnetent' count :  1
'setpwent' count :  1

Reply via email to