Hi All,

I just noticed that DBD::Pg loads POSIX solely so that it can call
isprint() and replace non-printable characters with backslash-escaped
octal codes. Here's what the code looks like:

  $str=join("", map { isprint($_)?$_:'\\'.sprintf("%03o",ord($_)) }
            split //, $str);

Now, this seemed rather silly to me. I couldn't imagine that it was
efficient, and I generally like to look for reasons to lose POSIX and
its bloat. This is what I came up with:

  $str =~ s/([^ -~])/'\\' . sprintf("%03o", ord($1))/ge;
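
To make the behavior concrete, here is a minimal self-contained sketch of
the regex version. The sub name escape_nonprintable and the sample string
are placeholders of mine, not anything in DBD::Pg, and it assumes plain
ASCII (byte) data:

```perl
use strict;
use warnings;

# Hypothetical helper name; it just wraps the substitution shown above.
sub escape_nonprintable {
    my ($str) = @_;
    # Anything outside the range space..tilde becomes a backslash
    # followed by its three-digit octal code.
    $str =~ s/([^ -~])/'\\' . sprintf("%03o", ord($1))/ge;
    return $str;
}

print escape_nonprintable("tab\there\n"), "\n";   # prints: tab\011here\012
```

Printable characters, including the space and the tilde, pass through
untouched.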

I ran a little benchmark comparing them, and this is what the results
looked like:

Benchmark: timing 100000 iterations of Posix, Regex...
     isprint: 17 wallclock secs (15.64 usr +  0.01 sys = 15.65 CPU) @ 6389.78/s (n=100000)
     regex:   2 wallclock secs ( 2.35 usr +  0.00 sys =  2.35 CPU) @ 42553.19/s (n=100000)
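
If anyone wants to try a comparison along these lines, a harness roughly
like the following should do. It is a sketch rather than the exact script
behind the numbers above: the sample string is made up, and it uses
cmpthese() from the Benchmark module with a time limit instead of a fixed
iteration count. It also probes for POSIX::isprint first, in case your
Perl doesn't provide it:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use POSIX ();

# Assumed sample input containing a tab, a NUL and a newline.
my $sample = "some text\twith\0control\ncharacters" x 10;

# Probe for POSIX::isprint; if the call dies, skip that candidate.
my $have_isprint = eval { POSIX::isprint("a"); 1 };

my %tests = (
    regex => sub {
        my $str = $sample;
        $str =~ s/([^ -~])/'\\' . sprintf("%03o", ord($1))/ge;
        return $str;
    },
);

$tests{isprint} = sub {
    my $str = $sample;
    $str = join("", map { POSIX::isprint($_) ? $_ : '\\' . sprintf("%03o", ord($_)) }
                    split //, $str);
    return $str;
} if $have_isprint;

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, \%tests);
```

Both candidates work on a copy of the same string, so the copy overhead
is the same on each side of the comparison.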

A huge improvement, no? But I'm not sure that it's as efficient (or as
terse) as it really could be. Furthermore, the POSIX docs say, 

  Consider using regular expressions and the "/[[:isprint:]]/" construct
  instead.

However, I couldn't figure out how to use this construct -- there's no
documentation on it that I can find. So the challenge is, can anyone
come up with an even better solution? The winner will be submitted to
the DBD::Pg maintainer (and the DBI list) as a patch.

Thanks!

David
-- 
David Wheeler                                     AIM: dwTheory
[EMAIL PROTECTED]                                 ICQ: 15726394
                                               Yahoo!: dew7e
                                               Jabber: [EMAIL PROTECTED]
