Hey Kevin,

In presentations from the HEP field (totally unrelated to Hadoop), I've seen folks claim the large instance is more than 4x better than the small, and less than 2x slower than the extra-large. I.e., it gave that application the best bang for its buck.
In other words, you're not completely crazy for believing this, and other people have reported seeing non-linear differences between the different instance types. I suspect the "best" will depend highly on what your app is doing.
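To make the comparison concrete, here is a rough sketch of how one might rank instance types by work done per dollar. The two hourly prices are the ones quoted in this thread; the throughput numbers, the `perf_per_dollar` and `best_value` helpers, and the "1.6x" ratio are purely hypothetical placeholders, so plug in whatever your own benchmark measures.

```python
# Hourly prices (USD) as quoted in this thread.
PRICE_PER_HOUR = {"large": 0.40, "extra-large": 0.80}


def perf_per_dollar(throughput, price_per_hour):
    """Return work units completed per dollar, given work units per hour."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return throughput / price_per_hour


def best_value(measured):
    """Return the instance type with the highest work-per-dollar figure.

    `measured` maps an instance type to its benchmark throughput
    (any consistent unit, e.g. records processed per hour).
    """
    return max(
        measured,
        key=lambda kind: perf_per_dollar(measured[kind], PRICE_PER_HOUR[kind]),
    )


# HYPOTHETICAL measurements: extra-large does 1.6x the work of large.
measured = {"large": 100.0, "extra-large": 160.0}

for kind, throughput in measured.items():
    value = perf_per_dollar(throughput, PRICE_PER_HOUR[kind])
    print(f"{kind}: {value:.1f} work units per dollar")
print("best value:", best_value(measured))
```

With these made-up numbers the large instance wins, because the extra-large would need to deliver a full 2x the throughput just to break even at twice the price.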
Brian

On Sep 29, 2009, at 12:19 PM, Kevin Peterson wrote:
Has anyone done any extensive testing of what instance types on Amazon EC2 give you the most bang for the buck?

Given the normal Hadoop recommendations of beefy machines, I would expect the best performance from the extra-large, but our testing showed otherwise. We did some rough testing while we were just getting started, with roughly a 10-node cluster, and we found that the extra-large instance doesn't come close to twice the actual performance of the large instance (priced at $0.80 and $0.40, respectively). My rationalization is that some of the resources are shared, and the extra-large instance corresponds to the actual hardware, while the large instance sometimes gets to take advantage of IO and network bandwidth beyond 50% when the other tenant isn't doing much.

I'm revisiting our config because we're deploying HBase soon, and I'm not sure whether I would be better off going to the extra-large instances so that I can co-locate the tasktrackers and the region servers on the same nodes, or if I should stick with large instances and put HBase on separate servers. Mostly I'm wondering if my results were a fluke.