On 2012.5.1 3:29 PM, Philipp K. Janert wrote:
> However, when working through a files with a few
> tens of millions of records, DateTime turns into a
> REAL drag on performance.
>
> Is this expected behavior? And are there access
> patterns that I can use to mitigate this effect?
> (I tried to supply a time_zone explicitly, but that
> does not seem to improve things significantly.)
Unfortunately, due to the way DateTime is architected, it does a lot of
precalculation on object instantiation that usually goes unused. So yes,
in that sense it is expected.

If all you need is date objects with a sensible interface, try
DateTimeX::Lite. It claims to replicate a good chunk of the DateTime
interface in a fraction of the memory.

Given how much time it takes to make a DateTime object, and your scale of
tens of millions of records, you could cache the DateTime object for each
distinct timestamp and use clone() to hand out fresh instances. Clone on
every return, including the first, so that a caller mutating its copy
cannot poison the cache:

    sub get_datetime {
        my $timestamp = shift;

        # Requires perl 5.10+ for "state".
        state $cache = {};

        $cache->{$timestamp} //= make_datetime_from_timestamp($timestamp);
        return $cache->{$timestamp}->clone;
    }

--
100. Claymore mines are not filled with yummy candy, and it is wrong
     to tell new soldiers that they are.
    -- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
       http://skippyslist.com/list/
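For concreteness, here is a minimal, runnable sketch of that caching idea
wired up to DateTime->from_epoch. The make_datetime_from_timestamp helper
is hypothetical and assumes the raw timestamps are epoch seconds in UTC;
adapt it to however your records actually encode times:

```perl
use strict;
use warnings;
use feature 'state';
use DateTime;    # CPAN module, not in core perl

# Hypothetical helper: assumes the timestamp is epoch seconds (UTC).
# Replace with whatever parsing your records actually need.
sub make_datetime_from_timestamp {
    my $timestamp = shift;
    return DateTime->from_epoch( epoch => $timestamp, time_zone => 'UTC' );
}

sub get_datetime {
    my $timestamp = shift;

    state $cache = {};

    # Construct at most one DateTime per distinct timestamp ...
    $cache->{$timestamp} //= make_datetime_from_timestamp($timestamp);

    # ... but always hand back a clone, so a caller mutating its
    # copy cannot corrupt the cached object.
    return $cache->{$timestamp}->clone;
}

# Repeated timestamps skip the expensive constructor entirely.
my $dt1 = get_datetime(1335900000);
my $dt2 = get_datetime(1335900000);
print $dt1->iso8601, "\n";
```

Note that the payoff depends on how many *distinct* timestamps the file
contains: if nearly every record is unique, the cache only adds overhead,
and truncating timestamps to a coarser resolution before the lookup may be
worth considering.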