Hi Marc,

I'd throw my hat in for MongoDB: it's extremely fast and I now adore it. Pop me a message on Twitter if you'd like to discuss it more.
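Whatever store ends up behind it, the core of the de-duplication is keying on the status ID and recording which users each copy was delivered for. Here is a minimal Python sketch, assuming each Site Streams message can be reduced to a status ID plus the user it arrived for. The class and method names are hypothetical, and a bounded in-memory map stands in for MongoDB or any other database:

```python
from collections import OrderedDict


class StatusDeduplicator:
    """Remembers recently seen status IDs, bounded to max_size entries.

    Hypothetical helper for illustration only; a real deployment would
    likely keep this state in a database with a unique key on the status ID.
    """

    def __init__(self, max_size=100_000):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        # status_id -> set of user IDs the status was delivered for,
        # ordered from least to most recently seen.
        self._seen = OrderedDict()

    def observe(self, status_id, for_user):
        """Record one delivery; return True only the first time an ID is seen."""
        is_new = status_id not in self._seen
        self._seen.setdefault(status_id, set()).add(for_user)
        self._seen.move_to_end(status_id)  # mark as most recently seen
        if len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # evict the least recently seen ID
        return is_new

    def recipients(self, status_id):
        """Return the users a still-remembered status was delivered for."""
        return set(self._seen.get(status_id, ()))


dedup = StatusDeduplicator(max_size=2)
first = dedup.observe(1001, "alice")   # first delivery of status 1001
second = dedup.observe(1001, "bob")    # same status, delivered for another user
```

The status is processed once (when `observe` returns True), while the per-user set keeps track of everyone it was delivered for. The size bound trades memory for the chance of missing a duplicate that arrives long after the original.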
Scott.

On 27 Oct 2010, at 19:05, M. Edward (Ed) Borasky wrote:

> Quoting Marc Mims <marc.m...@gmail.com>:
>
>> De-duplicating statuses in the Streaming API is fairly straightforward.
>> But with Site Streams, where a single status might be received multiple
>> times for multiple mentioned users, and/or as favorites, it is a bit
>> more difficult.
>>
>> I'm wondering if anyone can offer advice on an efficient method for
>> de-duplicating Site Streams.
>>
>> -Marc
>
> If you're talking about building something "massively scalable" for some
> value of "massive", you're getting into the realm of "NoSQL" databases. I
> *think* Cassandra has a Perl interface but I haven't looked at it recently.
> I'm by no means an expert on NoSQL databases - I just picked Cassandra
> because Twitter uses it for some things.

--
Twitter developer documentation and resources: http://dev.twitter.com/doc
API updates via Twitter: http://twitter.com/twitterapi
Issues/Enhancements Tracker: http://code.google.com/p/twitter-api/issues/list
Change your membership to this group: http://groups.google.com/group/twitter-development-talk