On Jul 16, 10:35 pm, Andy Freeman <ana...@earthlink.net> wrote:
> > I'm starting to think that the "GAE takes care of the messy details
> > of distributed systems programming" claim is a bit overstated...
>
> Global clock consistency requires very expensive clocks accessible
> from every server with known latency (and even that's a bit dodgy).
> AFAIK, GAE doesn't provide that, but who does?
>
> GAE doesn't do the impossible, but also doesn't say that it does. WRT
> the latter, would you really prefer otherwise?

But that's just it -- in many places it's claimed that GAE makes it all a cakewalk. From the datastore docs:

"""
Storing data in a scalable web application can be tricky. A user could be interacting with any of dozens of web servers at a given time, and the user's next request could go to a different web server than the one that handled the previous request. All web servers need to be interacting with data that is also spread out across dozens of machines, possibly in different locations around the world.

Thanks to Google App Engine, you don't have to worry about any of that. App Engine's infrastructure takes care of all of the distribution, replication and load balancing of data behind a simple API -- and you get a powerful query engine and transactions as well.
"""

You could argue that that's not claiming to do the impossible, but "you don't have to worry about any of that" is certainly not true. Nowhere in the documentation is there a discussion of the kinds of subtle gotchas that you need to be aware of when programming for this kind of system. It's all just "golly isn't this so gosh-darn easy!"

You have to go digging to find the article on transaction isolation where you find out that your queries can return results that, um, don't match your queries. And AFAICT you *do* have to worry about subsequent requests being handled by different servers, since there doesn't seem to be any guarantee that the datastore writes made in one request will be seen in the next. Memcache doesn't have transactions, so it seems like guaranteeing coherence with the datastore is tricky.

I worked in a distributed systems group for many years, so I know that many of these problems are simply inherent to distributed systems. It doesn't disturb me that they exist. What bothers me is the way these issues are broadly *ignored* by GAE's documentation.
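
To make the memcache point concrete, here is a minimal sketch of the kind of interleaving I have in mind. It is not GAE code: plain dicts stand in for the datastore and memcache, and the helper names (write_datastore, write_cache, read, run_interleaving) are made up for illustration. The only assumption is that each request writes the datastore first and the cache second, with nothing tying the two writes together.

```python
# Plain dicts stand in for the datastore and memcache (no GAE APIs used).
datastore = {}
cache = {}


def write_datastore(key, value):
    datastore[key] = value


def write_cache(key, value):
    cache[key] = value


def read(key):
    # Common read-through pattern: trust the cache if it holds the key.
    if key in cache:
        return cache[key]
    value = datastore.get(key)
    cache[key] = value
    return value


def run_interleaving():
    """Two requests each do 'datastore write, then cache write'.

    No transaction spans both stores, so request B's two steps can land
    between request A's two steps.
    """
    datastore.clear()
    cache.clear()
    write_datastore("counter", 1)  # request A, step 1
    write_datastore("counter", 2)  # request B, step 1
    write_cache("counter", 2)      # request B, step 2
    write_cache("counter", 1)      # request A, step 2 (arrives late)
    # Return what is durably stored and what a reader is actually served.
    return datastore["counter"], read("counter")


stored, served = run_interleaving()
print(stored, served)
```

In this ordering the datastore ends up holding request B's value while readers keep getting request A's value from the cache until the entry expires or is overwritten. Whether and how often this happens in a real app depends on timing I can't vouch for; the point is only that nothing in the pattern itself rules it out.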
If I weren't a bit savvy about distributed systems I probably wouldn't have realized that clock skew could cause problems, and nothing I read in GAE's docs would have helped me figure it out.

So no, I don't want GAE to claim to do the impossible; I want them to *stop* claiming to do the impossible. I would love to see some articles about the pitfalls of the system and how to avoid them or mitigate them. The transaction isolation article is great in that respect -- I hope people at Google are planning more along those lines.

Cheers,
-n8