Ted - I can't really speak to the Perl library, but I personally have spent *lots* of time optimizing the Java libraries. I suspect that with some time put in, you'll be able to find a lot of room for improvement.
The next step is probably to create a JIRA ticket for Perl performance enhancements, and then get crackin'. Hopefully those with more Perl experience will show up to review and comment on any patches you create.

-Bryan

2010/5/10 Ted Zlatanov <[email protected]>

> Apologies if this has been discussed before but I didn't see it in the
> archives.
>
> I see poor performance of any Perl code against Cassandra compared to
> Java. I generally clock a 5-20x speed difference using the raw Thrift
> API, depending on the number of structures that need to be
> serialized/deserialized. This is with Perl 5.10 vs. the latest Sun JVM.
>
> I maintain the Net::Cassandra::Easy Perl module that uses this interface
> so I'd like to make it faster. I think any performance improvements
> would be good for all Thrift users so I am posting here in the hopes of
> getting some feedback.
>
> It seems to me like one of the problems is the large number of OO method
> calls, which in Perl are slower than function calls. Another is that
> pack()/unpack() is probably the fastest way to serialize/deserialize data
> in Perl, but it's not used much. Instead I see step-by-step
> accumulation of values from the source data, which is suboptimal. In
> Java this makes perfect sense but in Perl it drags performance down.
>
> Perhaps a good optimization would be to generate the pack/unpack format
> strings at compilation time, combine them with static function wrappers,
> and use that instead of multiple OO calls? Although I am comfortable
> with Perl, I don't know Thrift well enough to recommend the best
> approach there. I hope to be helpful with benchmarks and specific
> optimizations, though.
>
> Thanks
> Ted
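
To make the pack()/unpack() suggestion above concrete, here is a minimal sketch. It is not taken from the Thrift Perl library. The struct (1: i32 id, 2: string name) and the names encode_example, decode_example and encode_stepwise are made up for illustration. The byte layout follows the Thrift binary protocol (one type byte, a 16-bit field id, then the value, with a stop byte at the end). The decoder assumes both fields are present and arrive in declaration order, which a real decoder could not rely on.

```perl
use strict;
use warnings;

# Type codes from the Thrift binary protocol.
use constant { T_STOP => 0, T_I32 => 8, T_STRING => 11 };

# Hypothetical struct: 1: i32 id, 2: string name.
# One template for the whole struct, built once rather than per call:
#   C n   field header: type byte, 16-bit big-endian field id
#   l>    signed 32-bit big-endian value (the > modifier needs Perl 5.10)
#   N/a*  32-bit big-endian length followed by that many bytes
#   C     stop byte
my $EXAMPLE_FMT = 'C n l> C n N/a* C';

# Single pack() call, plain function instead of chained method calls.
# $name is assumed to already be a byte string (e.g. encoded UTF-8).
sub encode_example {
    my ($id, $name) = @_;
    return pack($EXAMPLE_FMT, T_I32, 1, $id, T_STRING, 2, $name, T_STOP);
}

# Single unpack() call; headers are skipped, not validated.
sub decode_example {
    my ($bytes) = @_;
    my (undef, undef, $id, undef, undef, $name) = unpack($EXAMPLE_FMT, $bytes);
    return ($id, $name);
}

# Step-by-step accumulation of the same bytes, for comparison only.
sub encode_stepwise {
    my ($id, $name) = @_;
    my $buf = '';
    $buf .= pack('C', T_I32);
    $buf .= pack('n', 1);
    $buf .= pack('l>', $id);
    $buf .= pack('C', T_STRING);
    $buf .= pack('n', 2);
    $buf .= pack('N', length $name);
    $buf .= $name;
    $buf .= pack('C', T_STOP);
    return $buf;
}

my ($id, $name) = decode_example(encode_example(42, 'ted'));
```

Both encoders produce identical bytes, so the two could be compared with the core Benchmark module before investing in a code generator change. Whether the gain is anywhere near the 5-20x gap mentioned above is something only measurement would show.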
