On Tue, 26 Sep 2000, Bennett Todd wrote:

> 2000-09-26-05:18:57 Paris Sinclair:
> > >     (%alphabet) = $string =~ tr/a-z//;
> > 
> > also a little more concise (and certainly more efficient...) than
> > 
> >     %alphabet = map { $_ => eval "\$string =~ tr/$_//" } (a..z);
> 
> However, compared to say
> 
>       $hist[ord($_)]++ for split //, $string;
> 
> the performance edge might not be quite so dramatic. Then again,
> maybe it would be, I dunno.

But would that technique work with Unicode? What if I am just counting some
Bulgarian characters? Most encodings put these in the extended ASCII
range. Building an array of 250 slots to count 5 characters isn't going to
be more efficient. It also requires jumping through more hoops, and doing
more conversions, to figure out which index is which letter. A table could
be built, but if it maps to an array index based on ord(), then I
couldn't support both KOI8 and Windows Cyrillic encodings in the same
@hist structure. With a hash, the only limits are Perl's more general
language support, and I can still convert and store both KOI8 and
cp1251 results without needing to know which encoding a string
originated in; I only need a symbol for each character.
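To make the hash-based approach concrete, here is a minimal sketch (the sample string and variable names are my own, for illustration). Because hash keys are arbitrary strings rather than ord() indices, the same %hist works no matter which characters, or how many distinct ones, show up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; any string works, including Cyrillic text.
my $string = "abracadabra";

# Count each character by using it directly as a hash key.
# Only characters that actually occur get an entry, so counting
# 5 distinct letters costs 5 hash entries, not a 250-slot array.
my %hist;
$hist{$_}++ for split //, $string;

print "$_ => $hist{$_}\n" for sort keys %hist;
```

The same loop could be fed strings converted from KOI8 or cp1251 alike, since the hash never cares what numeric code point a character came from.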

There seem to be lots of beneficial side effects of extending context,
allowing for general solutions that are much more powerful than any of
the specific solutions.

Paris Sinclair    |    4a75737420416e6f74686572
[EMAIL PROTECTED]    |    205065726c204861636b6572
www.sinclairinternetwork.com
