"Niels Poppe" <[EMAIL PROTECTED]>
11/18/02 01:09 PM
To: <[EMAIL PROTECTED]>
cc: <[EMAIL PROTECTED]>
Subject: Re: [OT] Unicode vs URI escaping
Ok, ok, I'm not offended.
And even this last unpack/map/join attempt gets beaten by a wide margin
on speed when tested with strings longer than 4 characters. But the
point was to get correct output independent of perl version or context.
If you go for speed, make sure to 'use bytes' or 'no utf8' in those
escape functions, otherwise things might break, which was the reason
the whole issue came up on [EMAIL PROTECTED] in the first place.
So, the 'could be improved' still stands, for an escape function that
works under perl 5.00n and also under 5.6+ independent of use/no utf8
and/or bytes, emits no warnings when run with -w, produces correct
output when fed UTF-8 strings containing character values larger than
255, and is faster than the following:
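# Map every byte value 0..255 to its %XX escape ...
my %ESCMAP = ();
for ( 0 .. 255 ) { $ESCMAP{ $_ } = sprintf("%%%02X", $_); }
# ... then let the unreserved characters stand for themselves.
for ('a'..'z', 'A'..'Z', '0'..'9', '_', '.', '-') {
    $ESCMAP{ord($_)} = $_;
}
# Escape a string byte by byte via the lookup table.
sub escape {
    join '', map { $ESCMAP{$_} } unpack 'C*', shift
}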
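For comparison only, here is a rough sketch of how the "correct output for
characters above 255" requirement might be met if one is willing to give up
5.00n and assume perl 5.8 or later, where utf8::encode is built in. The name
escape_utf8 is made up for this illustration and is not from the thread. It
also assumes the input is a character string, so a lone byte such as "\xE9"
is encoded as U+00E9 rather than passed through as-is.

```perl
use strict;
use warnings;

# Same lookup table idea as above: byte value => escaped form.
my %ESCMAP = map { $_ => sprintf('%%%02X', $_) } 0 .. 255;
$ESCMAP{ ord $_ } = $_ for 'a' .. 'z', 'A' .. 'Z', '0' .. '9', '_', '.', '-';

# Hypothetical helper (assumes perl 5.8+): encode characters to UTF-8
# octets first, then escape octet by octet.
sub escape_utf8 {
    my $bytes = shift;
    utf8::encode($bytes);    # in place, on our private copy
    return join '', map { $ESCMAP{$_} } unpack 'C*', $bytes;
}

print escape_utf8("a b"), "\n";         # a%20b
print escape_utf8("\x{263A}"), "\n";    # %E2%98%BA
```

Encoding explicitly up front means the unpack only ever sees octets, so the
result should not depend on whether 'use bytes' or 'use utf8' is in effect at
the call site.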
N.
Andy Bach, Sys. Mangler
Internet: [EMAIL PROTECTED]
VOICE: (608) 261-5738 FAX 264-5030
Wire telegraph is a kind of a very, very long cat. You pull his tail in
New York and his head is meowing in Los Angeles. And radio operates exactly
the same way. The only difference is that there is no cat.
--Albert Einstein (explaining radio)