On 2012.5.1 3:29 PM, Philipp K. Janert wrote:
> However, when working through files with a few
> tens of millions of records, DateTime turns into a
> REAL drag on performance.
>
> Is this expected behavior? And are there access
> patterns that I can use to mitigate this effect?
> (I tried to supply a time_zone explicitly, but that
> does not seem to improve things significantly.)
Unfortunately, due to the way DateTime is architected, it does a lot of
precalculation at object instantiation, most of which is never used. So yes,
in that sense this is expected behavior.
If all you need is date objects with a sensible interface, try
DateTimeX::Lite. It claims to replicate a good chunk of the DateTime
interface in a fraction of the memory.
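Assuming its constructor really does mirror DateTime's as documented (I have
not benchmarked this myself), a drop-in would look something like:

```perl
use strict;
use warnings;
use DateTimeX::Lite;

# Same keyword constructor as DateTime, but a much lighter object.
my $dt = DateTimeX::Lite->new(
    year   => 2012,
    month  => 5,
    day    => 1,
    hour   => 15,
    minute => 29,
);

printf "%04d-%02d-%02d\n", $dt->year, $dt->month, $dt->day;
```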
Given how much time it takes to make a DateTime object, and your scale of tens
of millions of records, you could cache DateTime objects for each timestamp
and use clone() to get a new instance.
use v5.10;    # for "state"

sub get_datetime {
    my $timestamp = shift;

    state $cache = {};

    # Build the object once per distinct timestamp...
    $cache->{$timestamp} //= make_datetime_from_timestamp($timestamp);

    # ...but always hand back a clone, so a caller mutating its copy
    # (e.g. with set_time_zone) cannot corrupt the cached original.
    return $cache->{$timestamp}->clone;
}
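For illustration, here is a hypothetical processing loop around that sub. The
tab-separated record format and the strptime pattern are assumptions about
your data, not something from your mail; adjust them to whatever your records
actually contain.

```perl
use strict;
use warnings;
use DateTime::Format::Strptime;

# A possible make_datetime_from_timestamp(): one shared parser,
# constructed once, with an explicit time zone.
my $parser = DateTime::Format::Strptime->new(
    pattern   => '%Y-%m-%d %H:%M:%S',
    time_zone => 'UTC',
    on_error  => 'croak',
);

sub make_datetime_from_timestamp {
    my $timestamp = shift;
    return $parser->parse_datetime($timestamp);
}

while (my $line = <>) {
    chomp $line;
    my ($timestamp, $value) = split /\t/, $line;
    my $dt = get_datetime($timestamp);    # cache hit for repeated stamps
    # $dt is a clone; mutating it leaves the cache untouched.
}
```

With timestamps repeating across tens of millions of rows, the expensive
DateTime constructor runs only once per distinct timestamp and clone() does
the rest.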
--
100. Claymore mines are not filled with yummy candy, and it is wrong
to tell new soldiers that they are.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
http://skippyslist.com/list/