On Mar 8, 12:14 pm, Duncan Booth <duncan.bo...@invalid.invalid> wrote:
> You've totally missed the point. It isn't the size of the data you have
> today that matters, it's the size of data you could have in several years'
> time.
>
> Maybe today you've got 10 users each with 10 megabytes of data, but you're
> aspiring to become the next twitter/facebook or whatever. It's a bit late
> as you approach 100 million users (and a petabyte of data) to discover that
> your system isn't scalable: scalability needs to be built in from day one.

Do you have examples of sites that got big by planning their site architecture from day 0 to be big?

Judging from published accounts, even Facebook and Twitter did not plan to be 'the next twitter/facebook'; each started with a routine LAMP stack architecture and successfully re-engineered that architecture multiple times on the way up.

Is there a compelling reason to think the 'next twitter/facebook' can't and won't climb a very similar path?

I see good reasons to think that they *will* follow the same path, in that there are motivations at both ends of the path for re-engineering as you go. When the site is small, resources committed to the backend are not spent on making the frontend useful, so business-wise the best backend is the cheapest one. When the site becomes super-large, the backend gets re-engineered based on what that organization learned while the site was just large; Facebook, Twitter, and Craigslist all have architectures custom designed to support their specific needs. Had they tried to design for large size while they were small, they would have failed; they couldn't have known enough then about what they would eventually need.

The only example I can find of a large site that architected large very early is Google, and they were aiming for a market (search) that was already known to be huge.

It's reasonable to assume that the 'next twitter/facebook' will *not* be in web search, social networking, broadcast instant messaging, or classified ads, just because those niches are taken already.
So whichever 'high-scalability' model the aspiring site uses will be the wrong one. They might as well start with a quick and cheap LAMP stack, and re-engineer as they go.

Just one internet watcher's biased opinion...

David

www.rdbhost.com -> SQL databases via a web-service