Re: DateTime performance
On Thursday 03 May 2012 02:14:45 you wrote:
> > From: Philipp K. Janert [mailto:jan...@ieee.org]
> > Sent: Wednesday, 2 May 2012 8:29 AM
> >
> > Question:
> >
> > When using DateTime for a large number of
> > instances, it becomes a serious performance
> > drag.
>
> ...
>
> > Is this expected behavior? And are there access
> > patterns that I can use to mitigate this effect?
> > (I tried to supply a time_zone explicitly, but that
> > does not seem to improve things significantly.)
>
> Hi Philipp,
>
> My #1 tip is to pre-prepare/cache the DateTime::TimeZone object and pass it
> in to each creation of a DateTime object (via whatever mechanism you're
> using to do that). I have seen a case where we were using time_zone =>
> 'local' in a reasonably tight datetime object creation loop and saw
> significant speed increases just by cutting out that chunk of processing.
>
> In hindsight that was a silly thing to do but it became an easy win :-)
>
> I apologise if this is what you meant by supplying a time_zone explicitly
> in your comment above.

I have tried to specify the timezone explicitly
as a string:

  $dt = DateTime->new( ..., time_zone => "America/Chicago" )

which does not seem to help, but I have not tried to do:

  $tz = DateTime::TimeZone->new( name => 'America/Chicago' );
  $dt = DateTime->new( ..., time_zone => $tz );

I'll try that the next time I have to process one
of my data sets again. ;-)

Thanks for the hint.

> I can't recommend using a tool like NYTProf highly enough on a run of your
> tool to spot the low hanging fruit. See
> https://metacpan.org/module/Devel::NYTProf
>
> Cheers,
>
> Andrew
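
For reference, the tip quoted above (build the DateTime::TimeZone object once and pass it to every constructor call) might look roughly like the sketch below. The epoch values and the use of from_epoch are assumptions made for illustration; the thread does not say how the records are parsed, and it does not establish how much time this saves for a named zone as opposed to 'local'.

```perl
use strict;
use warnings;
use DateTime;
use DateTime::TimeZone;

# Build the time zone object once, outside the loop.
my $tz = DateTime::TimeZone->new( name => 'America/Chicago' );

# Hypothetical input: one epoch-seconds value per record.
my @epochs = ( 1335900000, 1335903600, 1335907200 );

my @dts;
for my $epoch (@epochs) {
    # Pass the prebuilt object instead of a zone name string,
    # so no time zone has to be resolved by name inside the loop.
    push @dts, DateTime->from_epoch( epoch => $epoch, time_zone => $tz );
}

print $_->iso8601, "\n" for @dts;
```

Profiling a run with Devel::NYTProf, as suggested in the quoted message, is the way to check whether this changes anything for a given data set.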
Re: DateTime performance
On Thursday 03 May 2012 02:10:04 you wrote:
> On 2012.5.1 3:29 PM, Philipp K. Janert wrote:
> > However, when working through files with a few
> > tens of millions of records, DateTime turns into a
> > REAL drag on performance.
> >
> > Is this expected behavior? And are there access
> > patterns that I can use to mitigate this effect?
> > (I tried to supply a time_zone explicitly, but that
> > does not seem to improve things significantly.)
>
> Unfortunately due to the way DateTime is architected it does a lot of
> precalculation upon object instantiation which is usually not used. So
> yes, it is expected in that sense.

Ok.

> If all you need is date objects with a sensible interface, try
> DateTimeX::Lite. It claims to replicate a good chunk of the DateTime
> interface in a fraction of the memory.

I'll check it out, thanks.

> Given how much time it takes to make a DateTime object, and your scale of
> tens of millions of records, you could cache DateTime objects for each
> timestamp and use clone() to get a new instance.

I considered that, but in reality, most of my
timestamps are actually different. (There are
about 30M seconds in a year, so I won't have
much duplication, looking at 10-50M records
spread over a year...)

> sub get_datetime {
>     my $timestamp = shift;
>
>     state $cache = {};
>
>     if( defined $cache->{$timestamp} ) {
>         return $cache->{$timestamp}->clone;
>     }
>     else {
>         $cache->{$timestamp} =
>             make_datetime_from_timestamp($timestamp);
>         return $cache->{$timestamp};
>     }
> }
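
One detail of the quoted get_datetime sketch: on a cache miss it returns the cached object itself rather than a clone, so a caller that modifies that object also modifies the cache entry. A possible variant that always returns a clone is sketched below. The body of make_datetime_from_timestamp is a hypothetical stand-in (the thread does not show it) and assumes the timestamps are epoch seconds in UTC.

```perl
use strict;
use warnings;
use feature 'state';
use DateTime;

# Hypothetical stand-in for the helper named in the quoted code;
# assumes each timestamp is epoch seconds.
sub make_datetime_from_timestamp {
    my $timestamp = shift;
    return DateTime->from_epoch( epoch => $timestamp, time_zone => 'UTC' );
}

sub get_datetime {
    my $timestamp = shift;

    state $cache = {};

    # Fill the cache on a miss, then always hand out a clone so that
    # callers never hold a reference to the cached object.
    $cache->{$timestamp} //= make_datetime_from_timestamp($timestamp);
    return $cache->{$timestamp}->clone;
}

my $first = get_datetime(1335900000);
$first->add( days => 1 );               # changes only the caller's copy
my $second = get_datetime(1335900000);  # cloned from the untouched cache entry
```

As noted in the reply above, this only pays off when timestamps repeat; with mostly distinct timestamps the cache grows without saving much work.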