On Fri, 29 Oct 2021 at 00:46, yary <not....@gmail.com> wrote:

> A small thing to begin with in the regex
> m/ ^ (@attributes) ':' \s (.+) $ /;
> m/ ^ (@attributes) ': ' (.*) $ /;
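
To spell out what that change does, here is a small sketch comparing the two patterns. The sample lines and the helper names old-form and new-form are made up for illustration; they are not from the LDIF file or from icheck.raku.

```raku
# Hypothetical attribute list and sample lines, for illustration only
my @attributes = <uid uidNumber>;

# Original pattern: any whitespace after the colon, value must be non-empty
sub old-form($line) { so $line ~~ m/ ^ (@attributes) ':' \s (.+) $ / }

# Cleaned-up pattern: a literal ": " separator, value may be empty
sub new-form($line) { so $line ~~ m/ ^ (@attributes) ': ' (.*) $ / }

for 'uid: norman', 'uid: ', "uid:\tnorman" -> $line {
    say "{$line.raku}  old: {old-form($line)}  new: {new-form($line)}";
}
```

As far as I can tell, the two forms agree on ordinary "key: value" lines and differ only at the edges: the new one accepts an empty value, and it rejects a separator that is not exactly a colon followed by one space.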

Yes, nice cleanup. Thanks.

> Next, how about adding a 2nd regex test similar to the "split" that also
> relies on User ignoring unknown fields? This accepts an empty-string key,
> which the "split" string handler does too.
>
> m/ ^ (<-[:]>*) ': ' (.*) /;

$ ./icheck.raku regex2
41391 entries by regex2 in 4.615332639 seconds

Whoa! That was surprising. The new regex is only about 2x slower than the
"split" method.

I did read on SO that someone claimed "longest-match alternation of the
list's elements" is slow, but I thought the conclusion in the answers was
that regexes in general are slow.

Might have to test this example again on 2021.10 (not easy for me).

>>> Results for rakudo-pkg-2021.9.0-01:
>>> $ ./icheck.raku regex
>>> 41391 entries by regex in 27.859560887 seconds
>>> $ ./icheck.raku starts
>>> 41391 entries by starts-with in 5.970667533 seconds
>>> $ ./icheck.raku split
>>> 41391 entries by split in 5.12252741 seconds
>>>
>>> Results for rakudo-pkg-2021.10.0-01
>>> $ ./icheck.raku regex
>>> 41391 entries by regex in 27.833870158 seconds
>>> $ ./icheck.raku starts
>>> 41391 entries by starts-with in 2.560101599 seconds
>>> $ ./icheck.raku split
>>> 41391 entries by split in 2.307679407 seconds
>>>
>>> --------------------------------------------------

#!/usr/bin/env raku

class User {
    has $.uid;
    has $.uidNumber;
    has $.gidNumber;
    has $.homeDirectory;
    has $.mode = 0;

    method attributes {
        # return <uid uidNumber gidNumber homeDirectory mode>;
        User.^attributes(:local)>>.name>>.substr(2); # Is the order guaranteed?
    }
}

# Read user info from LDIF file
my %ldap;
my @attributes = User.attributes;

multi MAIN ( "regex", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        next unless $line ~~ m/ ^ (@attributes) ': ' (.*) $ /;
        %f{$0} = "$1";
    }
    say "{%ldap.elems} entries by regex in {now - BEGIN now} seconds";
}

multi MAIN ( "regex2", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        next unless $line ~~ m/ ^ (<-[:]>*) ': ' (.*) $ /;
        %f{$0} = "$1";
    }
    say "{%ldap.elems} entries by regex2 in {now - BEGIN now} seconds";
}

multi MAIN ( "starts", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f );
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        for @attributes -> $a {
            if $line.starts-with( $a ~ ": " ) {
                %f{$a} = (split( ": ", $line, 2))[1];
                last;
            }
        }
    }
    say "{%ldap.elems} entries by starts-with in {now - BEGIN now} seconds";
}

multi MAIN ( "split", $ldif-fn = "db/icheck.ldif" ) {
    my ( %f, $k, $v );
    for $ldif-fn.IO.lines -> $line {
        when not $line { # blank line is LDIF entry terminator
            %ldap{%f<uid>} = User.new( |%f ); # attributes not used are ignored
        }
        when $line.starts-with( 'dn: ' ) { %f = () } # dn: starts a new entry
        ($k, $v) = split( ": ", $line, 2);
        %f{$k} = $v;
    }
    say "{%ldap.elems} entries by split in {now - BEGIN now} seconds";
}

--
Norman Gaywood, Computer Systems Officer
School of Science and Technology
University of New England
Armidale NSW 2351, Australia

ngayw...@une.edu.au  http://turing.une.edu.au/~ngaywood
Phone: +61 (0)2 6773 2412  Mobile: +61 (0)4 7862 0062

Please avoid sending me Word or Power Point attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html